Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models

Authors
Kim, Chanwoo [1, 3]
Variani, Ehsan [2]
Narayanan, Arun [2]
Bacchiani, Michiel [2]
Affiliations
[1] Samsung Res, Seoul, South Korea
[2] Google Speech, Mountain View, CA USA
[3] Google, Mountain View, CA USA
Keywords
Simulated data; room acoustics; robust speech recognition; deep learning
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we describe how to efficiently implement an acoustic room simulator that generates large-scale simulated data for training deep neural networks. Although the Google Room Simulator in [1] was shown to be quite effective in reducing Word Error Rates (WERs) for far-field applications by generating simulated far-field training sets, it requires a very large number of FFTs: the Room Simulator accounted for approximately 80% of CPU usage in our CPU/GPU training architecture [2]. In this work, we implement efficient OverLap Addition (OLA) based filtering using the open-source FFTW3 library. Further, we investigate the effects of Room Impulse Response (RIR) lengths. Experimentally, we conclude that we can cut the tail portion of an RIR whose power is more than 20 dB below the maximum power without sacrificing speech recognition accuracy. However, we observe that cutting more of the RIR tail than this threshold allows harms speech recognition accuracy on rerecorded test sets. Using these approaches, we reduced CPU usage for the room simulator portion to 9.69% in the CPU/GPU training architecture. Profiling results show a 22.4-times speed-up on a single machine and a 37.3-times speed-up on Google's distributed training infrastructure.
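The two techniques named in the abstract, truncating the RIR tail at a power threshold and filtering by overlap-add, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses NumPy's FFT as a stand-in for FFTW3, the function names `truncate_rir` and `overlap_add_filter` and the `block_size` parameter are hypothetical, and the threshold is assumed to apply to per-sample squared amplitude relative to the peak, which the abstract does not specify.

```python
import numpy as np


def truncate_rir(rir, threshold_db=20.0):
    """Drop the tail after the last sample whose power is within
    threshold_db of the peak power (assumed per-sample criterion)."""
    power = np.asarray(rir, dtype=np.float64) ** 2
    peak = power.max()
    if peak == 0.0:
        return rir
    cutoff = peak * 10.0 ** (-threshold_db / 10.0)
    last = np.nonzero(power >= cutoff)[0][-1]
    return rir[: last + 1]


def overlap_add_filter(signal, rir, block_size=4096):
    """Convolve signal with rir block by block using overlap-add."""
    m = len(rir)
    # Smallest power of two that holds one block's linear convolution.
    fft_size = 1 << (block_size + m - 2).bit_length()
    # The RIR spectrum is computed once and reused for every block.
    rir_spec = np.fft.rfft(rir, fft_size)
    out = np.zeros(len(signal) + m - 1)
    for start in range(0, len(signal), block_size):
        block = signal[start : start + block_size]
        seg = np.fft.irfft(np.fft.rfft(block, fft_size) * rir_spec, fft_size)
        n_valid = len(block) + m - 1
        # Tails of consecutive blocks overlap and are summed.
        out[start : start + n_valid] += seg[:n_valid]
    return out
```

A shorter RIR reduces `fft_size` for a given block size, which is one plausible reason tail truncation lowers the FFT cost; the speed-up figures in the abstract come from the authors' own system, not from this sketch.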
Pages: 3028-3032
Number of pages: 5