A 4-Kb 1-to-8-bit Configurable 6T SRAM-Based Computation-in-Memory Unit-Macro for CNN-Based AI Edge Processors

Cited by: 62
Authors
Chiu, Yen-Cheng [1]
Zhang, Zhixiao [2, 3]
Chen, Jia-Jing [1]
Si, Xin [2, 4]
Liu, Ruhui [1]
Tu, Yung-Ning [1]
Su, Jian-Wei [5]
Huang, Wei-Hsing [1]
Wang, Jing-Hong [1]
Wei, Wei-Chen [1]
Hung, Je-Min [1]
Sheu, Shyh-Shyuan [5]
Li, Sih-Han [5]
Wu, Chih-I [5]
Liu, Ren-Shuo [1]
Hsieh, Chih-Cheng [1]
Tang, Kea-Tiong [1]
Chang, Meng-Fan [1]
Affiliations
[1] Natl Tsing Hua Univ, Inst Elect Engn, Hsinchu 30013, Taiwan
[2] Natl Tsing Hua Univ, Hsinchu 30013, Taiwan
[3] Fuzhou Univ, Microelect & Solid State Elect Dept, Fuzhou 350108, Peoples R China
[4] Univ Elect Sci & Technol China, Integrated Circuit Design & Integrat Syst Dept, Chengdu 611731, Peoples R China
[5] Ind Technol Res Inst, Hsinchu 31040, Taiwan
Keywords
AI edge processor; CNN; computing-in-memory (CIM); SRAM
DOI
10.1109/JSSC.2020.3005754
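Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809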
Abstract
Previous SRAM-based computing-in-memory (SRAM-CIM) macros suffer from small read margins for high-precision operations, large cell-array area overhead, and limited compatibility with multiple input and weight configurations. This work presents a 1-to-8-bit configurable SRAM-CIM unit-macro using: 1) a hybrid structure combining 6T-SRAM-based in-memory binary product-sum (PS) operations with digital near-memory-computing multibit PS accumulation to increase read accuracy and reduce area overhead; 2) column-based place-value-grouped weight mapping and a serial-bit input (SBIN) mapping scheme to facilitate reconfiguration and increase array efficiency under various input and weight configurations; 3) a self-reference multilevel reader (SRMLR) to reduce read-out energy and achieve a sensing margin 2x that of the midpoint reference scheme; and 4) an input-aware bitline voltage compensation scheme to ensure successful read operations across various input-weight patterns. A 4-Kb configurable 6T-SRAM CIM unit-macro was fabricated in a 55-nm CMOS process with foundry 6T-SRAM cells. The resulting macro achieved an access time of 3.5 ns per cycle (pipelined) and an energy efficiency of 0.6-40.2 TOPS/W across binary to 8-b input/8-b weight precision.
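The arithmetic behind points 1) and 2) of the abstract can be illustrated with a short software sketch. It is not the authors' circuit. It assumes unsigned operands and ideal (error-free) 1-bit product-sums, and the function name bit_serial_product_sum is hypothetical. Each input is fed one bit at a time (serial-bit input), each weight bit position is treated as its own place-value column group, and the binary product-sums are combined by a digital shift-and-add step.

```python
def bit_serial_product_sum(inputs, weights, in_bits=8, w_bits=8):
    """Unsigned multibit dot product built from 1-bit product-sums.

    Illustrative model only: assumes unsigned values that fit in
    in_bits / w_bits and ideal binary product-sum readout.
    """
    total = 0
    for i in range(in_bits):
        # Serial-bit input: bit i of every input is applied in one cycle.
        in_plane = [(x >> i) & 1 for x in inputs]
        for j in range(w_bits):
            # Place-value grouping: bit j of every weight forms one column group.
            w_plane = [(w >> j) & 1 for w in weights]
            # Binary product-sum: count positions where both bits are 1.
            binary_ps = sum(a & b for a, b in zip(in_plane, w_plane))
            # Digital accumulation: scale by the combined place value 2^(i+j).
            total += binary_ps << (i + j)
    return total


inputs = [3, 5, 255]
weights = [7, 2, 128]
# The bit-serial result matches the ordinary dot product.
assert bit_serial_product_sum(inputs, weights) == sum(
    x * w for x, w in zip(inputs, weights)
)
```

Reducing in_bits or w_bits in this model mirrors the 1-to-8-bit configurability described in the abstract: fewer bit planes mean fewer binary product-sum steps for the same dot product.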
Pages: 2790-2801
Page count: 12
Related papers
14 in total
  • [1] A Twin-8T SRAM Computation-in-Memory Unit-Macro for Multibit CNN-Based AI Edge Processors
    Si, Xin
    Chen, Jia-Jing
    Tu, Yung-Ning
    Huang, Wei-Hsing
    Wang, Jing-Hong
    Chiu, Yen-Cheng
    Wei, Wei-Chen
    Wu, Ssu-Yen
    Sun, Xiaoyu
    Liu, Rui
    Yu, Shimeng
    Liu, Ren-Shuo
    Hsieh, Chih-Cheng
    Tang, Kea-Tiong
    Li, Qiang
    Chang, Meng-Fan
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2020, 55 (01) : 189 - 202
  • [2] A 55nm 1-to-8 bit Configurable 6T SRAM based Computing-in-Memory Unit-Macro for CNN-based AI Edge Processors
    Zhang, Zhixiao
    Chen, Jia-Jing
    Si, Xin
    Tu, Yung-Ning
    Su, Jian-Wei
    Huang, Wei-Hsing
    Wang, Jing-Hong
    Wei, Wei-Chen
    Chiu, Yen-Cheng
    Hong, Je-Min
    Sheu, Shyh-Shyuan
    Li, Sih-Han
    Liu, Ren-Shuo
    Hsieh, Chih-Cheng
    Tang, Kea-Tiong
    Chang, Meng-Fan
    [J]. 2019 IEEE ASIAN SOLID-STATE CIRCUITS CONFERENCE (A-SSCC), 2019, : 217 - 218
  • [3] A Twin-8T SRAM Computation-In-Memory Macro for Multiple-Bit CNN-Based Machine Learning
    Si, Xin
    Chen, Jia-Jing
    Tu, Yung-Ning
    Huang, Wei-Hsing
    Wang, Jing-Hong
    Chiu, Yen-Cheng
    Wei, Wei-Chen
    Wu, Ssu-Yen
    Sun, Xiaoyu
    Liu, Rui
    Yu, Shimeng
    Liu, Ren-Shuo
    Hsieh, Chih-Cheng
    Tang, Kea-Tiong
    Li, Qiang
    Chang, Meng-Fan
    [J]. 2019 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE (ISSCC), 2019, 62 : 396 - +
  • [4] A Dual-Split 6T SRAM-Based Computing-in-Memory Unit-Macro With Fully Parallel Product-Sum Operation for Binarized DNN Edge Processors
    Si, Xin
    Khwa, Win-San
    Chen, Jia-Jing
    Li, Jia-Fang
    Sun, Xiaoyu
    Liu, Rui
    Yu, Shimeng
    Yamauchi, Hiroyuki
    Li, Qiang
    Chang, Meng-Fan
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2019, 66 (11) : 4172 - 4185
  • [5] A 1.97 TFLOPS/W Configurable SRAM-Based Floating-Point Computation-in-Memory Macro for Energy-Efficient AI Chips
    Mai, Yangzhan
    Wang, Mingyu
    Zhang, Chuanghao
    Zhong, Baiqing
    Yu, Zhiyi
    [J]. 2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [6] A Local Computing Cell and 6T SRAM-Based Computing-in-Memory Macro With 8-b MAC Operation for Edge AI Chips
    Si, Xin
    Tu, Yung-Ning
    Huang, Wei-Hsing
    Su, Jian-Wei
    Lu, Pei-Jung
    Wang, Jing-Hong
    Liu, Ta-Wei
    Wu, Ssu-Yen
    Liu, Ruhui
    Chou, Yen-Chi
    Chung, Yen-Lin
    Shih, William
    Lo, Chung-Chuan
    Liu, Ren-Shuo
    Hsieh, Chih-Cheng
    Tang, Kea-Tiong
    Lien, Nan-Chun
    Shih, Wei-Chiang
    He, Yajuan
    Li, Qiang
    Chang, Meng-Fan
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2021, 56 (09) : 2817 - 2831
  • [7] A 28nm 384kb 6T-SRAM Computation-in-Memory Macro with 8b Precision for AI Edge Chips
    Su, Jian-Wei
    Chou, Yen-Chi
    Liu, Ruhui
    Liu, Ta-Wei
    Lu, Pei-Jung
    Wu, Ping-Chun
    Chung, Yen-Lin
    Hung, Li-Yang
    Ren, Jin-Sheng
    Pan, Tianlong
    Li, Sih-Han
    Chang, Shih-Chieh
    Sheu, Shyh-Shyuan
    Lo, Wei-Chung
    Wu, Chih-I
    Si, Xin
    Lo, Chung-Chuan
    Liu, Ren-Shuo
    Hsieh, Chih-Cheng
    Tang, Kea-Tiong
    Chang, Meng-Fan
    [J]. 2021 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE (ISSCC), 2021, 64 : 250 - +
  • [8] Split WL 6T SRAM-Based Bit Serial Computing-in-Memory Macro With High Signal Margin and High Throughput
    Lee, Young Kyu
    Ko, Dong Han
    Cho, Seokhee
    Yeo, Minjune
    Kang, Mingu
    Jung, Seong-Ook
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (04) : 1869 - 1873
  • [9] 8-Bit Precision 6T SRAM Compute-in-Memory Macro Using Global Bitline-Combining Scheme for Edge AI Chips
    Su, Jian-Wei
    Lu, Pei-Jung
    Wu, Ping-Chun
    Chou, Yen-Chi
    Liu, Ta-Wei
    Chung, Yen-Lin
    Hung, Li-Yang
    Ren, Jin-Sheng
    Huang, Wei-Hsing
    Chien, Chih-Han
    Mei, Peng-I
    Li, Sih-Han
    Sheu, Shyh-Shyuan
    Lo, Wei-Chung
    Chang, Shih-Chieh
    Hong, Hao-Chiao
    Lo, Chung-Chuan
    Liu, Ren-Shuo
    Hsieh, Chih-Cheng
    Tang, Kea-Tiong
    Chang, Meng-Fan
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (04) : 2304 - 2308
  • [10] Embedded 1-Mb ReRAM-Based Computing-in-Memory Macro With Multibit Input and Weight for CNN-Based AI Edge Processors
    Xue, Cheng-Xin
    Chen, Wei-Hao
    Liu, Je-Syu
    Li, Jia-Fang
    Lin, Wei-Yu
    Lin, Wei-En
    Wang, Jing-Hong
    Wei, Wei-Chen
    Huang, Tsung-Yuan
    Chang, Ting-Wei
    Chang, Tung-Cheng
    Kao, Hui-Yao
    Chiu, Yen-Cheng
    Lee, Chun-Ying
    King, Ya-Chin
    Lin, Chrong-Jung
    Liu, Ren-Shuo
    Hsieh, Chih-Cheng
    Tang, Kea-Tiong
    Chang, Meng-Fan
    [J]. IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2020, 55 (01) : 203 - 215