Implementation of Convolutional Neural Network for Speech Recognition

Cited by: 0

Authors:

Wang, Zhichao [1]

Na, Xingyu [1]

Liu, Yong [1]

Pan, Jielin [1]

Yan, Yonghong [1]

Affiliations:

[1] Chinese Acad Sci, Inst Acoust, Key Lab Speech Acoust & Content Understanding, Beijing, Peoples R China

Source:

INTERNATIONAL ACADEMIC CONFERENCE ON THE INFORMATION SCIENCE AND COMMUNICATION ENGINEERING (ISCE 2014) | 2014

Keywords:

Speech recognition; Convolutional neural network; Square structure; Acceleration;

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Thanks to the contribution of Professor Hinton, pre-trained context-dependent hybrid deep neural networks (DNNs) have achieved significant performance gains in many automatic speech recognition tasks. The convolutional neural network (CNN) is another type of neural network, one that has been applied successfully to many image processing tasks. Owing to its distinctive structure (local filtering and pooling), a CNN has some advantages over a DNN, such as smaller model size and translation invariance. In this paper, we implement a convolutional neural network for speech recognition. We first apply a CNN to the TIMIT database using the method proposed in [1]. Although it shows gains over the DNN, the network is too slow to train, so we propose a method to accelerate training. Next, we apply the system to a Mandarin task, which has more data and is much noisier than TIMIT. The experiments indicate that the approach also works there, and we obtain the best results with two convolutional layers.
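The two structural ideas named in the abstract, local filtering and pooling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 8-band toy frame, the filter weights, and the pool size are all assumptions chosen only to show how a shared filter keeps the parameter count small and how max pooling can absorb a small shift along the frequency axis.

```python
def local_filter(bands, weights):
    """Slide one shared weight vector over adjacent frequency bands (valid mode)."""
    width = len(weights)
    return [
        sum(w * x for w, x in zip(weights, bands[i:i + width]))
        for i in range(len(bands) - width + 1)
    ]


def max_pool(activations, size):
    """Keep the maximum of each non-overlapping group of `size` activations."""
    return [
        max(activations[i:i + size])
        for i in range(0, len(activations) - size + 1, size)
    ]


# Hypothetical 8-band frame with a single spectral peak, and the same frame
# with the peak shifted up by one band (toy data, not from the paper).
frame = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
shifted = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
weights = [0.5, 1.0, 0.5]  # illustrative filter, not taken from the paper

# The filter outputs differ between the two frames, but the pooled outputs match
# because both peaks fall inside the same pooling group.
pooled = max_pool(local_filter(frame, weights), size=3)
pooled_shifted = max_pool(local_filter(shifted, weights), size=3)

# Weight sharing: the filter has 3 parameters, whereas a fully connected layer
# producing the same 6 activations from 8 inputs would need 8 * 6 weights.
shared_params = len(weights)
dense_params = len(frame) * (len(frame) - len(weights) + 1)
```

In this toy case the invariance holds only because the shift stays within one pooling group; a larger shift would change the pooled output.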


Pages: 239-243

Page count: 5
