A Weighted Current Summation Based Mixed Signal DRAM-PIM Architecture for Deep Neural Network Inference

Cited by: 3
Authors
Sudarshan, Chirag [1 ]
Soliman, Taha [2 ]
Lappas, Jan [1 ]
Weis, Christian [1 ]
Sadi, Mohammad Hassani [1 ]
Jung, Matthias [3 ]
Guntoro, Andre [2 ]
Wehn, Norbert [1 ]
Affiliations
[1] TU Kaiserslautern, Microelect Syst Design Res Grp, D-67663 Kaiserslautern, Germany
[2] Robert Bosch GmbH Corp Res CR ADT2, D-70839 Stuttgart, Germany
[3] Fraunhofer Inst Expt Software Engn IESE, D-67663 Kaiserslautern, Germany
Keywords
Computer architecture; Random access memory; Parallel processing; Optical wavelength conversion; Neural networks; Kernel; Performance evaluation; Processing-in-memory; PIM; compute-in-memory; CIM; DRAM; CNN; DNN; EFFICIENT;
DOI
10.1109/JETCAS.2022.3170235
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Code
0808 ; 0809 ;
Abstract
Processing-in-Memory (PIM) is an emerging approach to bridge the memory-computation gap. One of the major challenges for PIM architectures targeting Deep Neural Network (DNN) inference is the implementation of area-intensive Multiply-Accumulate (MAC) units in memory technologies, especially for DRAM-based PIMs. The DRAM architecture restricts the integration of DNN computation units near the area-optimized commodity DRAM Sub-Array (SA) or Primary Sense Amplifier (PSA) region, where data parallelism is highest and data movement cost is lowest. In this paper, we present a novel DRAM-based PIM architecture that combines a bit-decomposed MAC operation with a Weighted Current Summation (WCS) technique to implement the MAC unit with minimal additional circuitry in the PSA region by leveraging mixed-signal design. The architecture employs a two-stage design: lightweight current-mirror-based analog units are placed near the SAs in the PSA region, while all other substantial logic is integrated near the bank peripheral region. Hence, our architecture attains a balance between data parallelism, data movement energy, and area optimization. For 8-bit CNN inference, our novel 8 Gb DRAM-PIM device achieves a peak performance of 142.8 GOPS while consuming 756.76 mW, resulting in an energy efficiency of 188.8 GOPS/W. The area overhead of such an 8 Gb device in a 2y-nm DRAM technology is 12.63% compared to a commodity 8 Gb DRAM device.
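The bit-decomposed MAC described in the abstract can be illustrated purely digitally. The sketch below is a minimal, hypothetical Python model, not the authors' mixed-signal circuit: the function name bit_decomposed_mac, the parameter w_bits, and the unsigned-weight assumption are illustrative choices, and the analog weighted current summation is stood in for by a power-of-two scaling of each weight bit-plane's partial sum. It also sanity-checks the quoted efficiency figure.

```python
# Illustrative sketch (not the authors' circuit): a bit-decomposed MAC in which
# each weight bit-plane produces a partial sum that is then combined with a
# power-of-two weighting, analogous to the binary-weighted current summation
# performed in the analog domain near the primary sense amplifiers.
import numpy as np

def bit_decomposed_mac(activations: np.ndarray, weights: np.ndarray, w_bits: int = 8) -> np.ndarray:
    """Compute activations @ weights via per-bit partial sums.

    Weights are assumed to be unsigned w_bits-wide integers; each bit-plane
    contributes a partial sum that is scaled by 2**b before accumulation,
    mimicking a binary-weighted summation of bit-line currents.
    """
    acc = np.zeros((activations.shape[0], weights.shape[1]), dtype=np.int64)
    for b in range(w_bits):
        bit_plane = (weights >> b) & 1          # one bit of every weight
        partial = activations @ bit_plane       # cheap partial sum over a single bit-plane
        acc += partial.astype(np.int64) << b    # binary weighting of the partial sum
    return acc

# Quick check against a direct integer matrix multiply.
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=(4, 16), dtype=np.int64)   # 8-bit activations
w = rng.integers(0, 256, size=(16, 8), dtype=np.int64)   # 8-bit unsigned weights
assert np.array_equal(bit_decomposed_mac(a, w), a @ w)

# Sanity check of the reported efficiency figure:
# 142.8 GOPS / 0.75676 W ~= 188.7 GOPS/W, in line with the quoted 188.8 GOPS/W.
print(142.8 / 0.75676)
```

Because the multiply distributes over the binary expansion of each weight, summing the bit-plane partial sums with power-of-two weights reproduces the exact integer MAC result; the paper's contribution is realizing this weighting with current mirrors in the PSA region rather than in digital logic.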
Pages: 367 - 380
Number of pages: 14
Related Papers
50 items in total
  • [1] Optimization of DRAM based PIM Architecture for Energy-Efficient Deep Neural Network Training
    Sudarshan, Chirag
    Sadi, Mohammad Hassani
    Weis, Christian
    Wehn, Norbert
    [J]. 2022 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS 22), 2022, : 1472 - 1476
  • [2] Mixed-Signal Computing for Deep Neural Network Inference
    Murmann, Boris
    [J]. IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2021, 29 (01) : 3 - 13
  • [3] Memristive-based Mixed-signal CGRA for Accelerating Deep Neural Network Inference
    Kazerooni-Zand, Reza
    Kamal, Mehdi
    Afzali-Kusha, Ali
    Pedram, Massoud
    [J]. ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2023, 28 (04)
  • [4] A Modular Mixed-Signal CVNS Neural Network Architecture
    Saffar, Farinoush
    Mirhassani, Mitra
    Ahmadi, Majid
    [J]. 2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,
  • [5] A Mixed-Signal Spiking Neuromorphic Architecture for Scalable Neural Network
    Luo, Chong
    Ying, Zhaozhong
    Zhu, Xiaolei
    Chen, Longlong
    [J]. 2017 NINTH INTERNATIONAL CONFERENCE ON INTELLIGENT HUMAN-MACHINE SYSTEMS AND CYBERNETICS (IHMSC 2017), VOL 1, 2017, : 179 - 182
  • [6] INFERENCE OF URBAN FUNCTION ZONE BASED ON DEEP NEURAL NETWORK
    Hou, Ankai
    Zhu, Mingcang
    Li, Pengshan
    He, Yong
    Zhang, Xiaobo
    Shi, Jibao
    Chen, Kai
    Weng, Tao
    Zheng, Zezhong
    Zhou, Guoqing
    [J]. IGARSS 2020 - 2020 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2020, : 4410 - 4413
  • [7] A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference
    Le Gallo, Manuel
    Khaddam-Aljameh, Riduan
    Stanisavljevic, Milos
    Vasilopoulos, Athanasios
    Kersting, Benedikt
    Dazzi, Martino
    Karunaratne, Geethan
    Brandli, Matthias
    Singh, Abhairaj
    Mueller, Silvia M.
    Buchel, Julian
    Timoneda, Xavier
    Joshi, Vinay
    Rasch, Malte J.
    Egger, Urs
    Garofalo, Angelo
    Petropoulos, Anastasios
    Antonakopoulos, Theodore
    Brew, Kevin
    Choi, Samuel
    Ok, Injo
    Philip, Timothy
    Chan, Victor
    Silvestre, Claire
    Ahsan, Ishtiaq
    Saulnier, Nicole
    Narayanan, Vijay
    Francese, Pier Andrea
    Eleftheriou, Evangelos
    Sebastian, Abu
    [J]. NATURE ELECTRONICS, 2023, 6 (09) : 680 - 693
  • [8] Pedestrian Detection Based on Deep Convolutional Neural Network with Ensemble Inference Network
    Fukui, Hiroshi
    Yamashita, Takayoshi
    Yamauchi, Yuji
    Fujiyoshi, Hironobu
    Murase, Hiroshi
    [J]. 2015 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2015, : 223 - 228
  • [9] A generalized neural network architecture based on distributed signal processing
    Demirkol, Askin
    [J]. ROUGH SETS AND KNOWLEDGE TECHNOLOGY, PROCEEDINGS, 2006, 4062 : 377 - 382