A 1.97 TFLOPS/W Configurable SRAM-Based Floating-Point Computation-in-Memory Macro for Energy-Efficient AI Chips

Citations: 2
Authors
Mai, Yangzhan [1 ]
Wang, Mingyu [1 ]
Zhang, Chuanghao [1 ]
Zhong, Baiqing [1 ]
Yu, Zhiyi [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Microelect Sci & Technol, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Computation-in-memory (CIM); Floating-point; SRAM; Parallel Alignment;
DOI
10.1109/ISCAS46773.2023.10182197
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Floating-point (FP) computation-in-memory (CIM) technology is in increasing demand for low-power neural network training. In this work, we propose an energy-efficient, configurable SRAM-based FP CIM macro. A mantissa parallel alignment method is proposed to improve calculation speed and accuracy in FP multiply-accumulate (MAC) operations. The mantissa CIM and the exponent CIM are designed separately to enable pipelining of exponent and mantissa operations, which increases computation throughput. Furthermore, the macro can be flexibly set to BF16 or FP32 precision by configuring the accumulators. The proposed FP CIM macro is analyzed in 40 nm CMOS technology, and the estimated area is 0.48 mm². Simulation results show that the macro achieves a frequency of 294 MHz at 1.1 V. In BF16 mode, the macro achieves a peak throughput of 56.5 GFLOPS and an energy efficiency of 1.97 TFLOPS/W, while in FP32 mode the peak throughput and energy efficiency are 16 GFLOPS and 0.62 TFLOPS/W.
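The abstract does not describe how mantissa alignment is implemented, so the following is only a minimal software sketch of the general idea behind aligned FP MAC: every product mantissa is shifted to a shared maximum exponent, and the shifted mantissas are then summed as plain integers. The function names `to_fixed` and `aligned_mac` and the parameter `mant_bits` are hypothetical and not taken from the paper; `mant_bits=8` is chosen only to roughly mimic BF16 significand precision. It is not a model of the authors' circuit.

```python
import math

def to_fixed(x, mant_bits):
    """Split x into (signed integer mantissa, exponent), x ~= mant * 2**exp."""
    if x == 0.0:
        return 0, 0
    frac, exp = math.frexp(x)                    # x = frac * 2**exp, 0.5 <= |frac| < 1
    mant = int(round(frac * (1 << mant_bits)))   # quantize to mant_bits bits
    return mant, exp - mant_bits

def aligned_mac(xs, ws, mant_bits=8):
    """Dot product computed with max-exponent mantissa alignment (illustrative)."""
    # Step 1: multiply mantissas and add exponents for every input/weight pair.
    prods = []
    for x, w in zip(xs, ws):
        mx, ex = to_fixed(x, mant_bits)
        mw, ew = to_fixed(w, mant_bits)
        prods.append((mx * mw, ex + ew))

    # Step 2: find the largest exponent among the non-zero products.
    exps = [e for m, e in prods if m != 0]
    if not exps:
        return 0.0
    e_max = max(exps)

    # Step 3: align every mantissa to e_max. Each shift depends only on that
    # product's own exponent, so in hardware all shifts could run in parallel.
    acc = 0
    for m, e in prods:
        if m != 0:
            acc += m >> (e_max - e)   # arithmetic right shift drops low-order bits

    # Step 4: integer accumulation is done; rescale once to a float.
    return math.ldexp(acc, e_max)

print(aligned_mac([1.5, -2.0, 0.25], [2.0, 0.5, 4.0]))
```

In this sketch the right shift in step 3 is where precision is lost: products much smaller than the largest one contribute fewer bits, which is the accuracy trade-off any alignment scheme has to manage.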
Pages: 5
Related papers
26 in total
  • [1] An SRAM-Based Hybrid Computation-in-Memory Macro Using Current-Reused Differential CCO
    Choi, Injun
    Choi, Edward Jongyoon
    Yi, Donghyeon
    Jung, Yoontae
    Seong, Hoyong
    Jeon, Hyuntak
    Kweon, Soon-Jae
    Chang, Ik-Joon
    Ha, Sohmyung
    Je, Minkyu
    [J]. IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2022, 12 (02) : 536 - 546
  • [2] STAR-SRAM: 43.06-TFLOPS/W, 1.89-TFLOPS/mm2, 400-Kb/mm2 Floating-Point SRAM-based Digital Computing-in-Memory Macro in 28-nm CMOS
    Lin, Chuan-Tung
    Oh, Jonghyun
    Lee, Kevin
    Seok, Mingoo
    [J]. 2024 IEEE CUSTOM INTEGRATED CIRCUITS CONFERENCE, CICC, 2024,
  • [3] A 28nm 1.644TFLOPS/W Floating-Point Computation SRAM Macro with Variable Precision for Deep Neural Network Inference and Training
    Jeong, Sangsu
    Park, Jeongwoo
    Jeon, Dongsuk
    [J]. ESSCIRC 2022- IEEE 48TH EUROPEAN SOLID STATE CIRCUITS CONFERENCE (ESSCIRC), 2022, : 145 - 148
  • [4] A 4-Kb 1-to-8-bit Configurable 6T SRAM-Based Computation-in-Memory Unit-Macro for CNN-Based AI Edge Processors
    Chiu, Yen-Cheng
    Zhang, Zhixiao
    Chen, Jia-Jing
    Si, Xin
    Liu, Ruhui
    Tu, Yung-Ning
    Su, Jian-Wei
    Huang, Wei-Hsing
    Wang, Jing-Hong
    Wei, Wei-Chen
    Hung, Je-Min
    Sheu, Shyh-Shyuan
    Li, Sih-Han
    Wu, Chih-I
    Liu, Ren-Shuo
    Hsieh, Chih-Cheng
    Tang, Kea-Tiong
    Chang, Meng-Fan
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2020, 55 (10) : 2790 - 2801
  • [5] An Energy-Efficient Hybrid SRAM-Based In-Memory Computing Macro for Artificial Intelligence Edge Devices
    Rajput, Anil Kumar
    Tiwari, Alok Kumar
    Pattanaik, Manisha
    [J]. CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (06) : 3589 - 3616
  • [6] An Energy-Efficient Hybrid SRAM-Based In-Memory Computing Macro for Artificial Intelligence Edge Devices
    Anil Kumar Rajput
    Alok Kumar Tiwari
    Manisha Pattanaik
    [J]. Circuits, Systems, and Signal Processing, 2023, 42 : 3589 - 3616
  • [7] A Floating-Point 6T SRAM In-Memory-Compute Macro Using Hybrid-Domain Structure for Advanced AI Edge Chips
    Wu, Ping-Chun
    Su, Jian-Wei
    Hong, Li-Yang
    Ren, Jin-Sheng
    Chien, Chih-Han
    Chen, Ho-Yu
    Ke, Chao-En
    Hsiao, Hsu-Ming
    Li, Sih-Han
    Sheu, Shyh-Shyuan
    Lo, Wei-Chung
    Chang, Shih-Chieh
    Lo, Chung-Chuan
    Liu, Ren-Shuo
    Hsieh, Chih-Cheng
    Tang, Kea-Tiong
    Chang, Meng-Fan
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2024, 59 (01) : 196 - 207
  • [8] A 5.99 TFLOPS/W Heterogeneous CIM-NPU Architecture for an Energy Efficient Floating-Point DNN Acceleration
    Park, Wonhoon
    Ryu, Junha
    Kim, Sangjin
    Um, Soyeon
    Jo, Wooyoung
    Kim, Sangyoeb
    Yoo, Hoi-Jun
    [J]. 2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [9] An energy-efficient floating-point compute SRAM with pipelined in-memory bit-parallel exponent and bitwise mantissa processing
    Mai, Yangzhan
    Wang, Mingyu
    Zhong, Baiqing
    Zhang, Chuanghao
    Zhang, Yicong
    Yu, Zhiyi
    [J]. ELECTRONICS LETTERS, 2023, 59 (14)
  • [10] RIME: A Scalable and Energy-Efficient Processing-In-Memory Architecture for Floating-Point Operations
    Lu, Zhaojun
    Arafin, Md Tanvir
    Qu, Gang
    [J]. 2021 26TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC), 2021, : 120 - 125