Performance Evaluation of Deep Convolutional Maxout Neural Network in Speech Recognition

Authors
Dehghani, Arash [1]
Seyyedsalehi, Seyyed Ali [1]
Affiliation
[1] Amirkabir Univ Technol, Dept Biomed Engn, Tehran, Iran
Keywords
Deep Neural Network; Continuous Speech Recognition; Convolutional Neural Network; Maxout Model; Rectified Linear Unit; Dropout
DOI
Not available
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
This paper evaluates and compares several structures and methods of deep artificial neural networks (DNNs) for continuous Persian speech recognition. Among the first neural network models used in speech recognition applications were fully connected neural networks (FCNNs) and, consequently, deep neural networks (DNNs). Although these models outperform GMM/HMM models, their structure is not suited to modeling local speech information. The convolutional neural network (CNN) is a good option for modeling the local structure of biological signals, including speech signals. Another issue that deep artificial neural networks face is convergence on the training data. The main obstacle to convergence is the presence of local minima in the training process. Pre-training methods for deep neural networks, despite their large computational cost, are powerful tools for escaping local minima. However, using appropriate neuron models in the network structure appears to be a better solution to this problem. The Rectified Linear Unit (ReLU) and Maxout models are the most suitable neuron models presented to date. Several experiments were carried out to evaluate the performance of these methods and structures. After their proper functioning was verified, a combination of all the models was implemented on the FARSDAT speech database for continuous speech recognition. The experimental results show that the combined model (CMDNN) improves the performance of ANNs in speech recognition by about 3% compared with pre-trained fully connected NNs with sigmoid neurons.
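As a rough illustration of the two neuron models named in the abstract, the sketch below computes a ReLU activation and a single maxout unit in plain Python. The function names, the weight layout, and the numeric values are assumptions made for this example and are not taken from the paper, which may define or implement these units differently.

```python
from typing import Sequence


def relu(x: float) -> float:
    """Rectified Linear Unit: passes positive inputs, maps the rest to zero."""
    return x if x > 0.0 else 0.0


def maxout(inputs: Sequence[float],
           weights: Sequence[Sequence[float]],
           biases: Sequence[float]) -> float:
    """One maxout unit: the maximum over k affine pieces of the input.

    weights[j] and biases[j] define piece j (a hypothetical layout
    chosen for this sketch).
    """
    pieces = [
        sum(w * x for w, x in zip(w_row, inputs)) + b
        for w_row, b in zip(weights, biases)
    ]
    return max(pieces)


# Illustrative values only; they are not results or parameters from the paper.
x = [1.0, -2.0]
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.5]

relu_out = relu(x[0] + x[1])   # relu(-1.0)
maxout_out = maxout(x, W, b)   # max over the pieces 1.0, -2.0, 1.5
```

A maxout unit with two pieces, one of which is fixed at zero, reduces to ReLU, which is one way to see the two models as related.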
Pages: 240-245
Number of pages: 6