Enabling Training of Neural Networks on Noisy Hardware

Cited by: 18
Author
Gokmen, Tayfun [1]
Affiliation
[1] IBM Res AI, Yorktown Hts, NY 10598 USA
Source
FRONTIERS IN ARTIFICIAL INTELLIGENCE
Keywords
learning algorithms; training algorithms; neural network acceleration; Bayesian neural network; in-memory computing; on-chip learning; crossbar arrays; memristor
DOI
10.3389/frai.2021.699148
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks (DNNs) are typically trained using the conventional stochastic gradient descent (SGD) algorithm. However, SGD performs poorly when used to train networks on non-ideal analog hardware composed of resistive device arrays with non-symmetric conductance modulation characteristics. Recently we proposed a new algorithm, the Tiki-Taka algorithm, that overcomes this stringent symmetry requirement. Here we build on Tiki-Taka and describe a more robust algorithm that further relaxes other stringent hardware requirements. This second version of the Tiki-Taka algorithm (referred to as TTv2) (1) reduces the required number of device conductance states from thousands to only tens, (2) increases the tolerance to noise in the device conductance modulations by about 100x, and (3) increases the tolerance to noise in the matrix-vector multiplications performed by the analog arrays by about 10x. Empirical simulation results show that TTv2 can train various neural networks close to their ideal accuracy even at extremely noisy hardware settings. TTv2 achieves these capabilities by complementing the original Tiki-Taka algorithm with lightweight, low-computational-complexity digital filtering operations performed outside the analog arrays. Therefore, the implementation cost of TTv2 relative to SGD and Tiki-Taka is minimal, and it retains the usual power and speed benefits of using analog hardware for training workloads. We also show how to extract the neural network from the analog hardware once training is complete for further model deployment. Similar to Bayesian model averaging, we form analog-hardware-compatible averages over the neural network weights derived from TTv2 iterates. This model average can then be transferred to other analog or digital hardware with notable improvements in test accuracy over the trained model itself. In short, we describe an end-to-end training and model extraction technique for extremely noisy crossbar-based analog hardware that can be used to accelerate DNN training workloads and match the performance of full-precision SGD.
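The abstract describes the mechanics of TTv2 only qualitatively: gradient updates land on an auxiliary analog array, a lightweight digital filter outside the arrays integrates noisy reads of that array, only filtered values that cross a threshold are transferred to the main weight array, and after training an average over the weight iterates is extracted as the deployable model. The NumPy sketch below illustrates that structure under stated assumptions; it is not the authors' reference implementation, and every name and hyperparameter in it (quantize, noisy_read, A, H, W, lr, beta, threshold, step_size, the transfer schedule) is an illustrative choice rather than a value from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def quantize(x, n_states=32, x_max=1.0):
    """Mimic a device with only tens of conductance states (hypothetical model)."""
    step = 2.0 * x_max / n_states
    return np.clip(np.round(x / step) * step, -x_max, x_max)


def noisy_read(m, read_noise=0.05):
    """Mimic noise in the analog readout (hypothetical noise level)."""
    return m + read_noise * rng.standard_normal(m.shape)


def ttv2_like_step(W, A, H, x, err, lr=0.1, beta=0.9, threshold=0.25,
                   transfer=False, step_size=2.0 / 32):
    """One TTv2-flavored update for a single weight matrix (simplified)."""
    # 1) The outer-product gradient update lands on the auxiliary analog array A,
    #    modeled with a few-state quantizer plus additive update noise.
    A[:] = quantize(A - lr * np.outer(err, x)
                    + 0.02 * rng.standard_normal(A.shape))
    if transfer:
        # 2) A lightweight digital low-pass filter H integrates noisy reads of A ...
        H[:] = beta * H + (1.0 - beta) * noisy_read(A)
        # 3) ... and only entries whose filtered value crosses a threshold trigger
        #    a single device-step update of the main analog array W.
        mask = np.abs(H) > threshold
        W[mask] = np.clip(W[mask] + np.sign(H[mask]) * step_size, -1.0, 1.0)
        H[mask] = 0.0
    return W, A, H


def running_average(W_avg, W, t):
    """Running mean over (noisy reads of) W iterates, in the spirit of the
    model-averaging/extraction step mentioned in the abstract."""
    return W_avg + (noisy_read(W) - W_avg) / (t + 1)


# Toy usage on a single linear layer fitting a fixed target matrix (illustrative only).
W_true = quantize(0.5 * rng.standard_normal((4, 8)))
W, A, H = np.zeros((4, 8)), np.zeros((4, 8)), np.zeros((4, 8))
W_avg = np.zeros((4, 8))
for t in range(500):
    x = rng.standard_normal(8)
    err = noisy_read(W) @ x - W_true @ x          # error signal of the toy "network"
    W, A, H = ttv2_like_step(W, A, H, x, err, transfer=(t % 5 == 0))
    W_avg = running_average(W_avg, W, t)          # averaged model for deployment
```

In this toy model the thresholded transfer is what provides the noise tolerance: individual noisy, few-state updates are absorbed by the auxiliary array A and the digital filter H, and the main array W moves only when the filtered evidence is consistent, while the running average over iterates yields a smoother model to transfer to other hardware.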
Pages: 14
Related Papers
50 records in total
  • [31] Neural networks in VLSI hardware
    Clarkson, T
    NEURAL NETWORKS AND THEIR APPLICATIONS, 1996, : 245 - 253
  • [32] Hardware reconfigurable neural networks
    Beuchat, JL
    Haenni, JO
    Sanchez, E
    PARALLEL AND DISTRIBUTED PROCESSING, 1998, 1388 : 91 - 98
  • [33] DCBT-Net: Training Deep Convolutional Neural Networks With Extremely Noisy Labels
    Olimov, Bekhzod
    Kim, Jeonghong
    Paul, Anand
    IEEE ACCESS, 2020, 8 : 220482 - 220495
  • [34] Co-teaching: Robust Training of Deep Neural Networks with Extremely Noisy Labels
    Han, Bo
    Yao, Quanming
    Yu, Xingrui
    Niu, Gang
    Xu, Miao
    Hu, Weihua
    Tsang, Ivor W.
    Sugiyama, Masashi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [35] A Study on the Impact of Data Augmentation for Training Convolutional Neural Networks in the Presence of Noisy Labels
    Pereira, Emeson
    Carneiro, Gustavo
    Cordeiro, Filipe R.
    2022 35TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI 2022), 2022, : 25 - 30
  • [36] An analysis of noisy recurrent neural networks
    Das, S
    Olurotimi, O
    ICNN - 1996 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS. 1-4, 1996, : 1297 - 1301
  • [37] Simulate-the-hardware: Training Accurate Binarized Neural Networks for Low-Precision Neural Accelerators
    Li, Jiajun
    Wang, Ying
    Liu, Bosheng
    Han, Yinhe
    Li, Xiaowei
    24TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC 2019), 2019, : 323 - 328
  • [38] Training Robust Deep Neural Networks on Noisy Labels Using Adaptive Sample Selection With Disagreement
    Takeda, Hiroshi
    Yoshida, Soh
    Muneyasu, Mitsuji
    IEEE ACCESS, 2021, 9 : 141131 - 141143
  • [39] Improved Categorical Cross-Entropy Loss for Training Deep Neural Networks with Noisy Labels
    Li, Panle
    He, Xiaohui
    Song, Dingjun
    Ding, Zihao
    Qiao, Mengjia
    Cheng, Xijie
    Li, Runchuan
    PATTERN RECOGNITION AND COMPUTER VISION, PT IV, 2021, 13022 : 78 - 89
  • [40] Improved Hopfield networks by training with noisy data
    Clift, F
    Martinez, TR
    IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 1138 - 1143