Performance Optimization of Speech Recognition System with Deep Neural Network Model

Cited by: 0

Author:

Wei Guan [1]

Affiliation:

[1] College of Modern Science and Technology, China Jiliang University, Hangzhou, Zhejiang

Source:

Optical Memory and Neural Networks | 2018 / Vol. 27 / No. 4

Keywords:

acoustic model; deep neural network; discriminative training; performance optimization; speech recognition

DOI:

10.3103/S1060992X18040094

Abstract:

With the development of the internet, man-machine interaction has become increasingly important, and precise speech recognition has become an important means of achieving it. In this study, deep neural network models were used to enhance speech recognition performance. A feedforward fully connected deep neural network, a time-delay neural network, a convolutional neural network and a feedforward sequence memory neural network were examined, and their speech recognition performance was evaluated by comparing their acoustic models. Moreover, the recognition performance of the models after adding human voice features of different dimensions was tested. The results showed that the performance of the speech recognition system could be improved effectively by using the deep neural network models; the feedforward sequence memory neural network performed best, followed by the deep neural network, the time-delay neural network and the convolutional neural network. Different extracted features improved model performance to different degrees: models using Fbank features outperformed those using Mel-frequency cepstral coefficient (MFCC) features. Model performance improved after the addition of vocal characteristics, and different models had different vocal characteristic dimensions. © 2018, Allerton Press, Inc.

Pages: 272-282

Number of pages: 10
