Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models

Cited by: 0
Authors
Kim, Chanwoo [1 ,3 ]
Variani, Ehsan [2 ]
Narayanan, Arun [2 ]
Bacchiani, Michiel [2 ]
Affiliations
[1] Samsung Res, Seoul, South Korea
[2] Google Speech, Mountain View, CA USA
[3] Google, Mountain View, CA USA
Keywords
Simulated data; room acoustics; robust speech recognition; deep learning
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In this paper, we describe how to efficiently implement an acoustic room simulator to generate large-scale simulated data for training deep neural networks. Although the Google Room Simulator in [1] was shown to be quite effective in reducing Word Error Rates (WERs) for far-field applications by generating simulated far-field training sets, it requires a very large number of FFTs; the room simulator accounted for approximately 80% of CPU usage in our CPU/GPU training architecture [2]. In this work, we implement efficient OverLap-Add (OLA) based filtering using the open-source FFTW3 library. Further, we investigate the effect of Room Impulse Response (RIR) length. Experimentally, we conclude that we can truncate the tail portions of RIRs where the power falls more than 20 dB below the maximum power without sacrificing speech recognition accuracy; truncating the RIR tail beyond this threshold, however, harms recognition accuracy on re-recorded test sets. Using these approaches, we reduced the room simulator's share of CPU usage to 9.69% in the CPU/GPU training architecture. Profiling shows a 22.4x speed-up on a single machine and a 37.3x speed-up on Google's distributed training infrastructure.
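The two optimizations the abstract describes — truncating the RIR where its power drops more than 20 dB below the peak, and overlap-add FFT-based filtering — can be sketched as follows. This is a minimal NumPy illustration of the general techniques, not the paper's FFTW3-based implementation; the function names, the 20 dB default, and the block size are illustrative assumptions.

```python
import numpy as np

def truncate_rir(rir, threshold_db=20.0):
    """Drop the RIR tail where power falls more than threshold_db
    below the peak power (illustrative, per-sample criterion)."""
    power_db = 20.0 * np.log10(np.abs(rir) + 1e-12)
    cutoff = power_db.max() - threshold_db
    last_kept = np.nonzero(power_db >= cutoff)[0][-1]
    return rir[: last_kept + 1]

def ola_filter(signal, rir, block_size=4096):
    """Convolve signal with rir via overlap-add FFT filtering."""
    m = len(rir)
    # FFT size: next power of two that holds one block's linear convolution
    n = 1 << (block_size + m - 2).bit_length()
    H = np.fft.rfft(rir, n)  # RIR spectrum, computed once per utterance
    out = np.zeros(len(signal) + m - 1)
    for start in range(0, len(signal), block_size):
        block = signal[start : start + block_size]
        # block is zero-padded to n, so circular conv == linear conv here
        y = np.fft.irfft(np.fft.rfft(block, n) * H, n)[: len(block) + m - 1]
        out[start : start + len(y)] += y  # overlap-add the block outputs
    return out
```

A shorter (truncated) RIR shrinks the per-block FFT size, which is where the CPU savings come from; precomputing the RIR spectrum once per utterance avoids repeated forward transforms.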
Pages: 3028-3032 (5 pages)