DEEP RECURRENT REGULARIZATION NEURAL NETWORK FOR SPEECH RECOGNITION

被引:0
|
作者
Chien, Jen-Tzung [1 ]
Lu, Tsai-Wei [1 ]
机构
[1] Natl Chiao Tung Univ, Dept Elect & Comp Engn, Hsinchu 30010, Taiwan
关键词
Recurrent neural network; model regularization; deep learning; acoustic model;
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
This paper presents a deep recurrent regularization neural network (DRRNN) for speech recognition. Our idea is to build a regularization neural network acoustic model by conducting the hybrid Tikhonov and weight-decay regularization which compensates the variations due to the input speech as well as the model parameters in the restricted Boltzmann machine as a pre-training stage for feature learning and structural modeling. In addition, a new backpropagation through time (BPTT) algorithm is developed by extending the truncated minibatch training for recurrent neural network where the minibatch BPTT is not only performed in recurrent layer but also in feedforward layer. The DRRNN acoustic model is accordingly established to capture the temporal correlation in a regularization neural network. Experimental results on the tasks of RM and Aurora4 show the effectiveness and robustness of using DRRNN for speech recognition.
引用
收藏
页码:4560 / 4564
页数:5
相关论文
共 50 条
  • [21] Speech Enhancement for Speaker Recognition Using Deep Recurrent Neural Networks
    Tkachenko, Maxim
    Yamshinin, Alexander
    Lyubimov, Nikolay
    Kotov, Mikhail
    Nastasenko, Marina
    [J]. SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 690 - 699
  • [22] A Study on Speech Emotion Recognition Using a Deep Neural Network
    Lee, Kyong Hee
    Choi, Hyun Kyun
    Jang, Byung Tae
    Kim, Do Hyun
    [J]. 2019 10TH INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGY CONVERGENCE (ICTC): ICT CONVERGENCE LEADING THE AUTONOMOUS FUTURE, 2019, : 1162 - 1165
  • [23] Audiovisual speech recognition based on a deep convolutional neural network
    Rudregowda, Shashidhar
    Patilkulkarni, Sudarshan
    Ravi, Vinayakumar
    H.L., Gururaj
    Krichen, Moez
    [J]. Data Science and Management, 2024, 7 (01): : 25 - 34
  • [24] A DEEP NEURAL NETWORK INTEGRATED WITH FILTERBANK LEARNING FOR SPEECH RECOGNITION
    Seki, Hiroshi
    Yamamoto, Kazumasa
    Nakagawa, Seiichi
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5480 - 5484
  • [25] Transfer Learning of Deep Neural Network for Speech Emotion Recognition
    Huang, Ying
    Hu, Mingqing
    Yu, Xianguo
    Wang, Tao
    Yang, Chen
    [J]. PATTERN RECOGNITION (CCPR 2016), PT II, 2016, 663 : 721 - 729
  • [26] Deep neural network architectures for dysarthric speech analysis and recognition
    Brahim Fares Zaidi
    Sid Ahmed Selouani
    Malika Boudraa
    Mohammed Sidi Yakoub
    [J]. Neural Computing and Applications, 2021, 33 : 9089 - 9108
  • [27] Improvement of Speech Emotion Recognition by Deep Convolutional Neural Network and Speech Features
    Mohanty, Aniruddha
    Cherukuri, Ravindranath C.
    Prusty, Alok Ranjan
    [J]. THIRD CONGRESS ON INTELLIGENT SYSTEMS, CIS 2022, VOL 1, 2023, 608 : 117 - 129
  • [28] Deep Convolution Neural Network Based Speech Recognition for Chhattisgarhi
    Londhe, Narendra D.
    Kshirsagar, Ghanahshyam B.
    Tekchandani, Hitesh
    [J]. 2018 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2018, : 667 - 671
  • [29] TOWARDS STRUCTURED DEEP NEURAL NETWORK FOR AUTOMATIC SPEECH RECOGNITION
    Liao, Yi-Hsiu
    Lee, Hung-yi
    Lee, Lin-shan
    [J]. 2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 137 - 144
  • [30] Deep neural network architectures for dysarthric speech analysis and recognition
    Zaidi, Brahim Fares
    Selouani, Sid Ahmed
    Boudraa, Malika
    Sidi Yakoub, Mohammed
    [J]. NEURAL COMPUTING & APPLICATIONS, 2021, 33 (15): : 9089 - 9108