Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models

Cited by: 0
Authors
Kim, Chanwoo [1 ,3 ]
Variani, Ehsan [2 ]
Narayanan, Arun [2 ]
Bacchiani, Michiel [2 ]
Affiliations
[1] Samsung Res, Seoul, South Korea
[2] Google Speech, Mountain View, CA USA
[3] Google, Mountain View, CA USA
Keywords
Simulated data; room acoustics; robust speech recognition; deep learning
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In this paper, we describe how to efficiently implement an acoustic room simulator to generate large-scale simulated data for training deep neural networks. Although the Google Room Simulator in [1] was shown to be quite effective in reducing Word Error Rates (WERs) for far-field applications by generating simulated far-field training sets, it requires a very large number of FFTs; the room simulator accounted for approximately 80% of CPU usage in our CPU/GPU training architecture [2]. In this work, we implement efficient OverLap-Add (OLA) based filtering using the open-source FFTW3 library. Further, we investigate the effect of Room Impulse Response (RIR) length. Experimentally, we conclude that we can truncate the tail portions of RIRs where the power falls more than 20 dB below the maximum power without sacrificing speech recognition accuracy; truncating the RIR tail beyond this threshold, however, harms recognition accuracy on re-recorded test sets. Using these approaches, we reduced the room simulator's share of CPU usage to 9.69% in the CPU/GPU training architecture. Profiling shows a 22.4x speed-up on a single machine and a 37.3x speed-up on Google's distributed training infrastructure.
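The two optimizations the abstract describes — truncating the RIR where its power drops more than 20 dB below the peak, and overlap-add FFT-based filtering — can be sketched as follows. This is a minimal NumPy illustration of the general techniques, not the paper's FFTW3-based implementation; the function names, the 20 dB default, and the block size are illustrative assumptions.

```python
import numpy as np

def truncate_rir(rir, threshold_db=20.0):
    """Drop the RIR tail where power falls more than threshold_db
    below the peak power (illustrative, per-sample criterion)."""
    power_db = 20.0 * np.log10(np.abs(rir) + 1e-12)
    cutoff = power_db.max() - threshold_db
    last_kept = np.nonzero(power_db >= cutoff)[0][-1]
    return rir[: last_kept + 1]

def ola_filter(signal, rir, block_size=4096):
    """Convolve signal with rir via overlap-add FFT filtering."""
    m = len(rir)
    # FFT size: next power of two that holds one block's linear convolution
    n = 1 << (block_size + m - 2).bit_length()
    H = np.fft.rfft(rir, n)  # RIR spectrum, computed once per utterance
    out = np.zeros(len(signal) + m - 1)
    for start in range(0, len(signal), block_size):
        block = signal[start : start + block_size]
        # block is zero-padded to n, so circular conv == linear conv here
        y = np.fft.irfft(np.fft.rfft(block, n) * H, n)[: len(block) + m - 1]
        out[start : start + len(y)] += y  # overlap-add the block outputs
    return out
```

A shorter (truncated) RIR shrinks the per-block FFT size, which is where the CPU savings come from; precomputing the RIR spectrum once per utterance avoids repeated forward transforms.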
Pages: 3028-3032 (5 pages)