Implementation of Convolutional Neural Network for Speech Recognition

Cited by: 0
Authors
Wang, Zhichao [1]
Na, Xingyu [1]
Liu, Yong [1]
Pan, Jielin [1]
Yan, Yonghong [1]
Affiliations
[1] Chinese Acad Sci, Inst Acoust, Key Lab Speech Acoust & Content Understanding, Beijing, Peoples R China
Keywords
Speech recognition; Convolutional neural network; Square structure; Acceleration
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Thanks to the contribution of Professor Hinton, pre-trained context-dependent hybrid deep neural networks (DNNs) have achieved significant performance gains in many automatic speech recognition tasks. The convolutional neural network (CNN) is another type of neural network that has been applied successfully to many image processing tasks. Owing to its distinctive structure (local filtering and pooling), a CNN has some advantages over a DNN, such as smaller model size and translation invariance. In this paper, we implement a convolutional neural network for speech recognition. We first apply the CNN to the TIMIT database using the method proposed in [1]. Although it shows gains over the DNN, the network is too slow to train, so we propose a method to accelerate training. Next, we apply the system to a Mandarin task, which has more data and is much noisier than TIMIT. The experiments indicate that the approach also works there, and we obtain the best results with two convolutional layers.
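The two structural ideas named in the abstract, local filtering and pooling, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 12-band frame, the filter weights, and the pooling size are hypothetical values chosen for illustration, and the helper names `conv1d_valid` and `max_pool` are made up here. The sketch shows a single shared filter sliding along the frequency axis, followed by non-overlapping max pooling, and compares the weight count with a fully connected layer of the same input and output size.

```python
def conv1d_valid(x, w):
    """Local filtering: slide one shared filter w along x (no padding)."""
    k = len(w)
    return [sum(w[j] * x[i + j] for j in range(k)) for i in range(len(x) - k + 1)]


def max_pool(x, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


# Hypothetical 12-band spectral frame with a single energy peak (assumption).
frame = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# The same frame with the peak moved up by one frequency band.
shifted = [0] + frame[:-1]

w = [0.5, 1.0, 0.5]  # illustrative filter weights, not taken from the paper

pooled_a = max_pool(conv1d_valid(frame, w), 5)
pooled_b = max_pool(conv1d_valid(shifted, w), 5)

# Weight sharing: one filter reuses 3 weights at every position, whereas a
# fully connected layer mapping 12 inputs to 10 outputs needs 12 * 10 weights.
shared_params = len(w)
dense_params = len(frame) * len(conv1d_valid(frame, w))

print(pooled_a, pooled_b)           # pooled outputs for the two frames
print(shared_params, dense_params)  # weight counts: shared filter vs dense
```

In this example the pooled outputs of the original and shifted frames coincide because the one-band shift stays inside a single pooling window; a larger shift that crosses a window boundary would change the output, so the invariance is only local.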
Pages: 239-243
Page count: 5
Related papers
50 in total
  • [1] Crossmixed convolutional neural network for digital speech recognition
    Diep, Quoc Bao
    Phan, Hong Yen
    Truong, Thanh-Cong
    [J]. PLOS ONE, 2024, 19 (04)
  • [2] Deep Convolutional Neural Network for Arabic Speech Recognition
    Amari, Rafik
    Noubigh, Zouhaira
    Zrigui, Salah
    Berchech, Dhaou
    Nicolas, Henri
    Zrigui, Mounir
    [J]. COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022, 2022, 13501 : 120 - 134
  • [3] Continuous Speech Recognition based on Convolutional Neural Network
    Zhang, Qing-qing
    Liu, Yong
    Pan, Jie-lin
    Yan, Yong-hong
    [J]. SEVENTH INTERNATIONAL CONFERENCE ON DIGITAL IMAGE PROCESSING (ICDIP 2015), 2015, 9631
  • [4] CONVOLUTIONAL NEURAL NETWORK TECHNIQUES FOR SPEECH EMOTION RECOGNITION
    Parthasarathy, Srinivas
    Tashev, Ivan
    [J]. 2018 16TH INTERNATIONAL WORKSHOP ON ACOUSTIC SIGNAL ENHANCEMENT (IWAENC), 2018, : 121 - 125
  • [5] Design of a Convolutional Neural Network for Speech Emotion Recognition
    Lee, Kyong Hee
    Kim, Do Hyun
    [J]. 11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 1332 - 1335
  • [6] Multiresolution Convolutional Neural Network For Robust Speech Recognition
    Naderi, Navid
    Nasersharif, Babak
    [J]. 2017 25TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2017, : 1459 - 1464
  • [7] Efficient GPU implementation of convolutional neural networks for speech recognition
    van den Berg, Ewout
    Brand, Daniel
    Bordawekar, Rajesh
    Rachevsky, Leonid
    Ramabhadran, Bhuvana
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 1483 - 1487
  • [8] A Method of Speech Coding for Speech Recognition Using a Convolutional Neural Network
    Kubanek, Mariusz
    Bobulski, Janusz
    Kulawik, Joanna
    [J]. SYMMETRY-BASEL, 2019, 11 (09): 1 - 12
  • [9] Automatic Speech Recognition trained with Convolutional Neural Network and predicted with Recurrent Neural Network
    Soundarya, M.
    Karthikeyan, P. R.
    Thangarasu, Gunasekar
    [J]. 2023 9TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENERGY SYSTEMS, ICEES, 2023, : 41 - 45
  • [10] Audiovisual speech recognition based on a deep convolutional neural network
    Rudregowda S.
    Patilkulkarni S.
    Ravi V.
    H.L. G.
    Krichen M.
    [J]. Data Science and Management, 2024, 7 (01): 25 - 34