Performance Optimization of Speech Recognition System with Deep Neural Network Model

Cited by: 0
Author
Wei Guan [1]
Affiliation
[1] College of Modern Science and Technology, China Jiliang University, Hangzhou, Zhejiang
Keywords
acoustic model; deep neural network; discriminative training; performance optimization; speech recognition
DOI
10.3103/S1060992X18040094
Abstract
With the development of the internet, man-machine interaction has become increasingly important, and precise speech recognition is an important means of achieving it. In this study, deep neural network models were used to enhance speech recognition performance. Four architectures were studied: the feedforward fully connected deep neural network, the time-delay neural network, the convolutional neural network and the feedforward sequence memory neural network. Their speech recognition performance was evaluated by comparing their acoustic models. In addition, the recognition performance of the models was tested after adding human voice features of different dimensions. The results showed that deep neural network models effectively improved the performance of the speech recognition system. The feedforward sequence memory neural network performed best, followed by the deep neural network, the time-delay neural network and the convolutional neural network. Different extracted features improved model performance to different degrees: models using Fbank features outperformed those using Mel-frequency cepstral coefficient (MFCC) features. Model performance improved after vocal characteristics were added, and the vocal characteristic dimension differed between models. © 2018, Allerton Press, Inc.
Pages: 272-282
Page count: 10