Performance Evaluation of Deep Convolutional Maxout Neural Network in Speech Recognition

Cited by: 0

Authors:

Dehghani, Arash [1]

Seyyedsalehi, Seyyed Ali [1]

Affiliation:

[1] Amirkabir Univ Technol, Dept Biomed Engn, Tehran, Iran

Source:

2018 25TH IRANIAN CONFERENCE ON BIOMEDICAL ENGINEERING AND 2018 3RD INTERNATIONAL IRANIAN CONFERENCE ON BIOMEDICAL ENGINEERING (ICBME) | 2018

Keywords:

Deep Neural Network; Continuous Speech Recognition; Convolutional Neural Network; Maxout Model; Rectified Linear Unit; Dropout

DOI:

Not available

Chinese Library Classification (CLC) number:

R318 [Biomedical Engineering]

Discipline classification code:

0831

Abstract:

This paper evaluates and compares several deep artificial neural network (DNN) structures and training methods for continuous Persian speech recognition. Among the first neural network models used in speech recognition were fully connected neural networks (FCNNs) and, subsequently, deep neural networks (DNNs). Although these models outperform GMM/HMM models, their structure is not suited to modeling local speech information. The convolutional neural network (CNN) is a good option for modeling the local structure of biological signals, including speech. Another issue deep networks face is convergence on the training data; the main obstacle to convergence is the presence of local minima in the training process. Pre-training methods for deep neural networks, despite their high computational cost, are powerful tools for escaping local minima. However, using appropriate neuron models in the network structure appears to be a better solution to this problem. The rectified linear unit (ReLU) and the maxout model are the most suitable neuron models presented to date. Several experiments were carried out to evaluate the performance of these methods and structures. After verifying that the methods work properly, a combination of all the models was implemented on the FARSDAT speech database for continuous speech recognition. The experimental results show that the combined model (CMDNN) improves speech recognition performance by about 3% compared with pre-trained fully connected networks with sigmoid neurons.
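For readers unfamiliar with the two neuron models named in the abstract, the following is a minimal NumPy sketch of a ReLU activation and a maxout layer. It is not the authors' implementation: the function names, array shapes, and number of linear pieces `k` are assumptions chosen for illustration only. A maxout unit outputs the maximum over `k` affine functions of its input, and ReLU is the special case in which one of two pieces is fixed at zero.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def maxout(x, W, b):
    """Maxout layer (illustrative shapes, not taken from the paper).

    x: input vector, shape (d_in,)
    W: weights for k linear pieces, shape (k, d_out, d_in)
    b: biases for k linear pieces, shape (k, d_out)
    Returns the element-wise maximum over the k affine outputs, shape (d_out,).
    """
    z = np.einsum("kod,d->ko", W, x) + b  # k affine projections of x
    return z.max(axis=0)                  # keep the largest piece per output unit

# Usage: with k = 2 and the second piece set to zero, maxout reduces to ReLU.
x = np.array([1.0, -2.0])
W = np.zeros((2, 3, 2))
W[0] = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.zeros((2, 3))
out = maxout(x, W, b)
```

Because each maxout unit learns its own piecewise-linear activation, it needs `k` times the weights of a ReLU unit with the same output width; the value of `k` here is arbitrary.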

Pages: 240-245

Page count: 6
