Hardware-Aware Softmax Approximation for Deep Neural Networks

Cited by: 13
Authors
Geng, Xue [1]
Lin, Jie [1]
Zhao, Bin [2]
Kong, Anmin [2]
Aly, Mohamed M. Sabry [3]
Chandrasekhar, Vijay [1]
Affiliations
[1] ASTAR, I2R, Singapore, Singapore
[2] ASTAR, IME, Singapore, Singapore
[3] Nanyang Technol Univ, Sch CSE, Singapore, Singapore
Source
Keywords
Softmax; Nonlinear operation; Power; Area
DOI
10.1007/978-3-030-20870-7_7
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Custom hardware for accelerating the inference of deep neural networks (DNNs) has developed rapidly, with hardware metrics (e.g., area and energy) explicitly incorporated as constraints in addition to application accuracy. Recent efforts have mainly focused on linear functions (matrix multiplication) in convolutional (Conv) or fully connected (FC) layers, while there is no publicly available study on optimizing the inference of non-linear functions in DNNs under hardware constraints. In this paper, we address the problem of cost-efficient inference for Softmax, a popular non-linear function in DNNs. We introduce a hardware-aware linear approximation framework based on algorithm and hardware co-optimization, with the goal of minimizing area and energy cost without incurring significant loss in application accuracy. This is achieved by simultaneously reducing the operand bit-width and approximating cost-intensive operations in Softmax (e.g., exponential and division) with cost-effective operations (e.g., addition and bit shifts). We designed and synthesized a hardware unit for our approximation approach to estimate its area and energy consumption. In addition, we introduce a training method that further saves area and energy through reduced precision. Our approach reduces area cost by 13x and energy consumption by 2x with an 11-bit operand width, compared to a 19-bit baseline, on the VOC2007 dataset with Faster R-CNN.
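The abstract does not spell out the approximation itself, so the following is only an illustrative sketch of the general idea it names: replacing the exponential and the division in Softmax with additions and shifts. It is not the authors' hardware design. The function names (`approx_exp2`, `approx_log2`, `approx_softmax`) are hypothetical, floating-point values stand in for reduced-precision fixed-point operands, and the piecewise-linear forms `2**f ≈ 1 + f` and `log2(m) ≈ m - 1` are assumed here as a common choice.

```python
import math

LOG2_E = 1.4426950408889634  # log2(e); a constant multiply (shift-add in hardware)

def approx_exp2(t: float) -> float:
    """Approximate 2**t as (1 + frac(t)) * 2**floor(t): one add and one shift."""
    k = math.floor(t)
    f = t - k                      # fractional part, in [0, 1)
    return math.ldexp(1.0 + f, k)  # (1 + f) * 2**k

def approx_log2(v: float) -> float:
    """Approximate log2(v) for v > 0 as exponent + (mantissa - 1)."""
    m, e = math.frexp(v)           # v = m * 2**e with m in [0.5, 1)
    return (e - 1) + (2.0 * m - 1.0)

def approx_softmax(logits):
    """Softmax sketch that avoids exp and division (assumed scheme, see lead-in)."""
    peak = max(logits)
    # exp(x - peak) == 2**((x - peak) * log2(e))
    t = [(x - peak) * LOG2_E for x in logits]
    total = sum(approx_exp2(ti) for ti in t)
    # Division becomes a subtraction in the log2 domain.
    log_total = approx_log2(total)
    return [approx_exp2(ti - log_total) for ti in t]

probs = approx_softmax([2.0, 1.0, 0.1])
```

In this sketch the outputs only approximately sum to one, because both linear pieces carry a few percent of error; the ordering of the inputs is preserved, which is often what matters for classification.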
Pages: 107-122
Page count: 16