Performance Optimization of Speech Recognition System with Deep Neural Network Model

Authors
Wei Guan [1]
Affiliations
[1] College of Modern Science and Technology, China Jiliang University, Hangzhou, Zhejiang
Keywords
acoustic model; deep neural network; discriminative training; performance optimization; speech recognition
DOI
10.3103/S1060992X18040094
Abstract
With the development of the internet, man-machine interaction has become increasingly important, and precise speech recognition has become an important means of achieving it. In this study, deep neural network models were used to enhance speech recognition performance. Four architectures were studied: the feedforward fully connected deep neural network, the time-delay neural network, the convolutional neural network and the feedforward sequence memory neural network. Their speech recognition performance was compared through their acoustic models. In addition, the recognition performance of the models was tested after adding human voice features of different dimensions. The results showed that deep neural network models could effectively improve the performance of the speech recognition system. The feedforward sequence memory neural network performed best, followed by the deep neural network, the time-delay neural network and the convolutional neural network. Different extracted features improved model performance to different degrees: models using Fbank features outperformed those using Mel-frequency cepstral coefficient (MFCC) features. Model performance improved after vocal characteristics were added, and the vocal characteristic dimension differed from model to model. © 2018, Allerton Press, Inc.
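As background to the Fbank versus MFCC comparison in the abstract, the following is a minimal sketch of how the two feature types are commonly related: Fbank features are log energies from a mel-spaced triangular filterbank, and MFCCs are obtained by applying a discrete cosine transform to those log energies. This is not the paper's pipeline. The function names, the 16 kHz sample rate, the 10 filters, the 5 cepstral coefficients and the synthetic 440 Hz test frame are all assumptions chosen for illustration.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window the frame and return its one-sided power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centers are equally spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (num_filters + 1)
                  for i in range(num_filters + 2)]
    # Fractional DFT-bin positions of the filter edges and centers.
    bins = [mel_to_hz(m) * n_fft / sample_rate for m in mel_points]
    filters = []
    for m in range(1, num_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if left < k <= center:
                row.append((k - left) / (center - left))
            elif center < k < right:
                row.append((right - k) / (right - center))
            else:
                row.append(0.0)
        filters.append(row)
    return filters

def fbank_and_mfcc(frame, sample_rate=16000, num_filters=10, num_ceps=5):
    """Return (fbank, mfcc) for one frame; all defaults are illustrative assumptions."""
    spectrum = power_spectrum(frame)
    filters = mel_filterbank(num_filters, len(frame), sample_rate)
    # Fbank: log of the energy collected by each mel filter (floored to avoid log(0)).
    fbank = [math.log(max(sum(w * p for w, p in zip(row, spectrum)), 1e-10))
             for row in filters]
    # MFCC: unnormalized DCT-II of the Fbank vector, keeping the first num_ceps terms.
    mfcc = [sum(fbank[m] * math.cos(math.pi * n * (m + 0.5) / num_filters)
                for m in range(num_filters))
            for n in range(num_ceps)]
    return fbank, mfcc

# Synthetic 440 Hz tone, 256 samples at 16 kHz (hypothetical test input).
frame = [math.sin(2.0 * math.pi * 440.0 * i / 16000.0) for i in range(256)]
fbank, mfcc = fbank_and_mfcc(frame)
print(len(fbank), len(mfcc))
```

Because the DCT step compresses and decorrelates the filterbank energies, MFCCs discard some spectral detail that Fbank features retain. That is one commonly cited reason neural acoustic models may do better with Fbank input, though the abstract itself does not state why.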
Pages: 272-282
Number of pages: 10
Related papers
50 in total
  • [31] Improvement of Speech Emotion Recognition by Deep Convolutional Neural Network and Speech Features
    Mohanty, Aniruddha
    Cherukuri, Ravindranath C.
    Prusty, Alok Ranjan
    THIRD CONGRESS ON INTELLIGENT SYSTEMS, CIS 2022, VOL 1, 2023, 608 : 117 - 129
  • [32] TOWARDS STRUCTURED DEEP NEURAL NETWORK FOR AUTOMATIC SPEECH RECOGNITION
    Liao, Yi-Hsiu
    Lee, Hung-yi
    Lee, Lin-shan
    2015 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING (ASRU), 2015, : 137 - 144
  • [33] Deep neural network architectures for dysarthric speech analysis and recognition
    Zaidi, Brahim Fares
    Selouani, Sid Ahmed
    Boudraa, Malika
    Sidi Yakoub, Mohammed
    NEURAL COMPUTING & APPLICATIONS, 2021, 33 (15): : 9089 - 9108
  • [34] Deep Convolution Neural Network Based Speech Recognition for Chhattisgarhi
    Londhe, Narendra D.
    Kshirsagar, Ghanahshyam B.
    Tekchandani, Hitesh
    2018 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND INTEGRATED NETWORKS (SPIN), 2018, : 667 - 671
  • [35] Speech enhancement system using deep neural network optimized with Battle Royale Optimization
    Shukla, Neeraj Kumar
    Shajin, Francis H.
    Rajendran, Radhika
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 92
  • [36] Neural network optimization using genetic algorithms for speech recognition
    Mouria-Beji, F
    ENGINEERING INTELLIGENT SYSTEMS FOR ELECTRICAL ENGINEERING AND COMMUNICATIONS, 2002, 10 (02): : 69 - 74
  • [37] Neural network optimization using genetic algorithms for speech recognition
    Mouria-Beji, Faryel
    International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications, 2002, 10 (02): : 69 - 74
  • [38] Design of Neural Network Model for Emotional Speech Recognition
    Palo, H. K.
    Mohanty, Mihir Narayana
    Chandra, Mahesh
    ARTIFICIAL INTELLIGENCE AND EVOLUTIONARY ALGORITHMS IN ENGINEERING SYSTEMS, VOL 2, 2015, 325 : 291 - 300
  • [39] MCFC: A fuzzy neural network model for speech recognition
    Kuo, YH
    Hsu, JP
    Kao, CI
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 1996, 4 (04) : 257 - 268
  • [40] Neural Network Phone Duration Model for Speech Recognition
    Alumae, Tanel
    15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1204 - 1208