Performance Optimization of Speech Recognition System with Deep Neural Network Model

被引:0
|
作者
Wei Guan [1 ]
机构
[1] College of Modern Science and Technology, China Jiliang University, HangzhouZhejiang
关键词
acoustic model; deep neural network; discriminative training; performance optimization; speech recognition;
D O I
10.3103/S1060992X18040094
中图分类号
学科分类号
摘要
Abstract: With the development of internet, man-machine interaction has tended to be more important. Precise speech recognition has become an important means to achieve man-machine interaction. In this study, deep neural network model was used to enhance speech recognition performance. Feedforward fully connected deep neural network, time-delay neural network, convolutional neural network and feedforward sequence memory neural network were studied, and their speech recognition performance was studied by comparing their acoustic models. Moreover, the recognition performance of the model after adding different dimension human voice features was tested. The results showed that the performance of the speech recognition system could be improved effectively by using the deep neural network model, and the performance of feedforward sequence memory neural network was the best, followed by deep neural network, time-delay neural network and convolutional neural network. Different extraction features had different improvement effects on model performance. The performance of the model which was added with Fbank extraction features was superior to that added with Mel-frequency cepstrum coefficient (MFCC) extraction feature. The model performance improved after the addition of vocal characteristics. Different models had different vocal characteristic dimensions. © 2018, Allerton Press, Inc.
引用
收藏
页码:272 / 282
页数:10
相关论文
共 50 条
  • [1] Language Model Optimization for a Deep Neural Network Based Speech Recognition System for Serbian
    Pakoci, Edvin
    Popovic, Branislav
    Pekar, Darko
    SPEECH AND COMPUTER, SPECOM 2017, 2017, 10458 : 483 - 492
  • [2] Evaluation of Modified Deep Neural Network Architecture Performance for Speech Recognition
    Haque, Md Amaan
    Alex, John Sahaya Rani
    Venkatesan, Nithya
    2018 INTERNATIONAL CONFERENCE ON INTELLIGENT AND ADVANCED SYSTEM (ICIAS 2018) / WORLD ENGINEERING, SCIENCE & TECHNOLOGY CONGRESS (ESTCON), 2018,
  • [3] Performance Evaluation of Deep Convolutional Maxout Neural Network in Speech Recognition
    Dehghani, Arash
    Seyyedsalehi, Seyyed Ali
    2018 25TH IRANIAN CONFERENCE ON BIOMEDICAL ENGINEERING AND 2018 3RD INTERNATIONAL IRANIAN CONFERENCE ON BIOMEDICAL ENGINEERING (ICBME), 2018, : 240 - 245
  • [4] Speech Recognition Model for Assamese Language Using Deep Neural Network
    Singh, Moirangthem Tiken
    Barman, Partha Pratim
    Gogoi, Rupjyoti
    2018 INTERNATIONAL CONFERENCE ON RECENT INNOVATIONS IN ELECTRICAL, ELECTRONICS & COMMUNICATION ENGINEERING (ICRIEECE 2018), 2018, : 2722 - 2727
  • [5] A Multi-Region Deep Neural Network Model in Speech Recognition
    Cui, Jia
    Saon, George
    Ramabhadran, Bhuvana
    Kingsbury, Brian
    16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3244 - 3248
  • [6] Stimulated Deep Neural Network for Speech Recognition
    Wu, Chunyang
    Karanasou, Penny
    Gales, Mark J. F.
    Sim, Khe Chai
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 400 - 404
  • [7] Deep Belief Network Optimization in Speech Recognition
    Prasetio, Murman Dwi
    Hayashida, Tomohiro
    Nishizaki, Ichiro
    Sekizaki, Shinya
    2017 INTERNATIONAL CONFERENCE ON SUSTAINABLE INFORMATION ENGINEERING AND TECHNOLOGY (SIET), 2017, : 138 - 143
  • [8] BILINGUAL SPEECH RECOGNITION SYSTEM FOR ISOLATED WORDS USING DEEP NEURAL NETWORK
    Bharathi, B.
    Kavitha, S.
    Sugapriya, S.
    2018 2ND INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION, AND SIGNAL PROCESSING (ICCCSP): SPECIAL FOCUS ON TECHNOLOGY AND INNOVATION FOR SMART ENVIRONMENT, 2018, : 78 - 81
  • [9] Deep Convolutional Neural Network and Gray Wolf Optimization Algorithm for Speech Emotion Recognition
    Mohammad Reza Falahzadeh
    Fardad Farokhi
    Ali Harimi
    Reza Sabbaghi-Nadooshan
    Circuits, Systems, and Signal Processing, 2023, 42 : 449 - 492
  • [10] Deep Convolutional Neural Network and Gray Wolf Optimization Algorithm for Speech Emotion Recognition
    Falahzadeh, Mohammad Reza
    Farokhi, Fardad
    Harimi, Ali
    Sabbaghi-Nadooshan, Reza
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (01) : 449 - 492