A Small-Footprint Accelerator for Large-Scale Neural Networks

Cited by: 3
Authors
Chen, Tianshi [1]
Zhang, Shijin [1]
Liu, Shaoli [1]
Du, Zidong [1]
Luo, Tao [1]
Gao, Yuan [2]
Liu, Junjie [2]
Wang, Dongsheng [2]
Wu, Chengyong [1]
Sun, Ninghui [1]
Chen, Yunji [1, 4]
Temam, Olivier [3]
Affiliations
[1] Chinese Acad Sci, ICT, SKLCA, Beijing 100190, Peoples R China
[2] Tsinghua Univ, TNLIST, Beijing 100084, Peoples R China
[3] Inria, Saclay, France
[4] Chinese Acad Sci, Ctr Excellence Brain Sci, Beijing 100190, Peoples R China
Source
ACM TRANSACTIONS ON COMPUTER SYSTEMS | 2015 / Vol. 33 / Issue 02
Keywords
Architecture; Processor; Hardware; RECOGNITION;
DOI
10.1145/2701417
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Machine-learning tasks are becoming pervasive in a broad range of domains, and in a broad range of systems (from embedded systems to data centers). At the same time, a small set of machine-learning algorithms (especially Convolutional and Deep Neural Networks, i.e., CNNs and DNNs) are proving to be state-of-the-art across many applications. As architectures evolve toward heterogeneous multicores composed of a mix of cores and accelerators, a machine-learning accelerator can achieve the rare combination of efficiency (due to the small number of target algorithms) and broad application scope. Until now, most machine-learning accelerator designs have been focusing on efficiently implementing the computational part of the algorithms. However, recent state-of-the-art CNNs and DNNs are characterized by their large size. In this study, we design an accelerator for large-scale CNNs and DNNs, with a special emphasis on the impact of memory on accelerator design, performance, and energy. We show that it is possible to design an accelerator with a high throughput, capable of performing 452 GOP/s (key NN operations such as synaptic weight multiplications and neuron output additions) in a small footprint of 3.02 mm² and 485 mW; compared to a 128-bit 2 GHz SIMD processor, the accelerator is 117.87x faster, and it can reduce the total energy by 21.08x. The accelerator characteristics are obtained after layout at 65 nm. Such a high throughput in a small footprint can open up the usage of state-of-the-art machine-learning algorithms in a broad set of systems and for a broad set of applications.
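The headline figures in the abstract can be combined into a few derived efficiency numbers. The sketch below is illustrative arithmetic only: the constant and function names are hypothetical, the inputs are the values quoted in the abstract, and the derived ratios are not results reported in the record. In particular, the implied power ratio assumes the speedup and the energy reduction were measured over the same workload, and "total energy" may cover more than the accelerator's 485 mW.

```python
# Figures quoted in the abstract (characteristics after layout at 65 nm).
THROUGHPUT_GOPS = 452.0    # GOP/s
AREA_MM2 = 3.02            # mm^2
POWER_W = 0.485            # 485 mW expressed in watts
SPEEDUP = 117.87           # vs. a 128-bit 2 GHz SIMD processor
ENERGY_REDUCTION = 21.08   # total-energy reduction vs. the same baseline


def derived_metrics():
    """Return efficiency figures derived from the abstract's numbers.

    These are back-of-the-envelope values, not numbers stated in the source.
    """
    return {
        # Throughput per watt of accelerator power.
        "gops_per_watt": THROUGHPUT_GOPS / POWER_W,
        # Throughput per unit of silicon area.
        "gops_per_mm2": THROUGHPUT_GOPS / AREA_MM2,
        # Energy = power x time, so (time ratio) / (energy ratio) gives the
        # average-power ratio of the accelerated run to the baseline run.
        # Assumption: both ratios refer to the same workload.
        "implied_power_ratio": SPEEDUP / ENERGY_REDUCTION,
    }


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name}: {value:.2f}")
```

Read this way, a speedup that exceeds the energy reduction suggests the accelerated configuration draws more average power than the baseline while finishing far sooner; that is an inference under the stated assumption, not a claim made in the abstract.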
Pages: 27
Related papers
  • [1] Convolutional Neural Networks for Small-footprint Keyword Spotting
    Sainath, Tara N.
    Parada, Carolina
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1478 - 1482
  • [2] SMALL-FOOTPRINT KEYWORD SPOTTING USING DEEP NEURAL NETWORKS
    Chen, Guoguo
    Parada, Carolina
    Heigold, Georg
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [3] Small-Footprint Highway Deep Neural Networks for Speech Recognition
    Lu, Liang
    Renals, Steve
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (07) : 1502 - 1511
  • [4] Convolutional Recurrent Neural Networks for Small-Footprint Keyword Spotting
    Arik, Sercan O.
    Kliegl, Markus
    Child, Rewon
    Hestness, Joel
    Gibiansky, Andrew
    Fougner, Chris
    Prenger, Ryan
    Coates, Adam
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1606 - 1610
  • [5] Practical large-scale forest stand inventory using a small-footprint airborne scanning laser
    Næsset, E
    SCANDINAVIAN JOURNAL OF FOREST RESEARCH, 2004, 19 (02) : 164 - 179
  • [6] A Configurable Accelerator for Keyword Spotting Based on Small-Footprint Temporal Efficient Neural Network
    He, Keyan
    Chen, Dihu
    Su, Tao
    ELECTRONICS, 2022, 11 (16)
  • [7] Small-footprint Deep Neural Networks with Highway Connections for Speech Recognition
    Lu, Liang
    Renals, Steve
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 12 - 16
  • [8] A High-Performance Accelerator for Large-Scale Convolutional Neural Networks
    Sun, Fan
    Wang, Chao
    Gong, Lei
    Xu, Chongchong
    Zhang, Yiwei
    Lu, Yuntao
    Li, Xi
    Zhou, Xuehai
    2017 15TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS AND 2017 16TH IEEE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING AND COMMUNICATIONS (ISPA/IUCC 2017), 2017, : 622 - 629
  • [9] KNOWLEDGE DISTILLATION FOR SMALL-FOOTPRINT HIGHWAY NETWORKS
    Lu, Liang
    Guo, Michelle
    Renals, Steve
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4820 - 4824
  • [10] Small-footprint Spiking Neural Networks for Power-efficient Keyword Spotting
    Pedroni, Bruno U.
    Sheik, Sadique
    Mostafa, Hesham
    Paul, Somnath
    Augustine, Charles
    Cauwenberghs, Gert
    2018 IEEE BIOMEDICAL CIRCUITS AND SYSTEMS CONFERENCE (BIOCAS): ADVANCED SYSTEMS FOR ENHANCING HUMAN HEALTH, 2018, : 591 - 594