Effect of Articulatory Δ and ΔΔ Parameters on Multilayer Neural Network based Speech Recognition

Authors
Banik, Manoj [1]
Kotwal, Mohammed Rokibul Alam [2]
Hassan, Foyzul [3]
Islam, Gazi Md. Moshfiqul [2]
Rahman, Sharif Mohammad Musfiqur [2]
Hasan, Mohammad Mahedi [2, 4]
Muhammad, Ghulam [5]
Huda, Mohammad Nurul [2]
Affiliations
[1] Ahsanullah Univ Sci & Technol, Dept CSE, Dhaka, Bangladesh
[2] United Int Univ, Dept CSE, Dhaka, Bangladesh
[3] Enosis Solut, Dhaka, Bangladesh
[4] Blueliner Bangladesh, Dhaka, Bangladesh
[5] King Saud Univ, Coll CIS, Dept CE, Riyadh, Saudi Arabia
Keywords
Distinctive Phonetic Features; Multi-Layer Neural Network; Local Features; Dynamic Parameters; Hidden Markov Models
DOI
Not available
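Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809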
Abstract
This paper describes the effect of articulatory dynamic parameters (Delta and Delta-Delta) on neural-network-based automatic speech recognition (ASR). Systems based on articulatory features (AFs), or distinctive phonetic features (DPFs), outperform acoustic-feature-based systems in ASR. This performance can be further improved by incorporating articulatory dynamic parameters. We propose a phoneme recognition system that comprises three stages: (i) extraction of DPFs from acoustic features using a multilayer neural network (MLN); (ii) incorporation of dynamic parameters into a second MLN to reduce DPF context; and (iii) addition of an Inhibition/Enhancement (In/En) network to categorize DPF movement more accurately, together with a Gram-Schmidt (GS) orthogonalization procedure to decorrelate the inhibited/enhanced data vector before it is passed to a hidden Markov model (HMM)-based classifier. Experiments on the Japanese Newspaper Article Sentences (JNAS) corpus show that the proposed method achieves a higher phoneme correct rate than a method that does not incorporate dynamic articulatory parameters. Moreover, it requires fewer mixture components in the HMM to obtain higher recognition performance.
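The abstract does not state how the Delta and Delta-Delta parameters are computed. As a rough illustration only, the sketch below uses the regression formula that is common in speech front ends, applied to a matrix of per-frame DPF values. The function names `delta` and `add_dynamic_parameters`, the window of 2 frames, and the edge padding are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def delta(features: np.ndarray, window: int = 2) -> np.ndarray:
    """Regression-based time derivative of a (frames, dims) feature array.

    Assumption: the standard delta formula
        d[t] = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2),  n = 1..window
    The paper's own definition may differ.
    """
    num_frames = features.shape[0]
    # Repeat the first and last frames so every frame has full context.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    out = np.zeros_like(features, dtype=float)
    for n in range(1, window + 1):
        ahead = padded[window + n : window + n + num_frames]
        behind = padded[window - n : window - n + num_frames]
        out += n * (ahead - behind)
    return out / denom


def add_dynamic_parameters(dpf: np.ndarray) -> np.ndarray:
    """Stack static DPF values with their Delta and Delta-Delta (hypothetical helper)."""
    d1 = delta(dpf)   # Delta: first-order dynamics
    d2 = delta(d1)    # Delta-Delta: delta of the delta
    return np.hstack([dpf, d1, d2])


if __name__ == "__main__":
    # Toy input: 10 frames, 2 made-up DPF dimensions rising linearly in time.
    dpf = np.arange(10, dtype=float)[:, None] * np.array([1.0, 2.0])
    print(add_dynamic_parameters(dpf).shape)
```

In a pipeline like the one described, a stacked vector of this kind would be the input to the second MLN; the In/En network and GS orthogonalization stages are not sketched here.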
Pages: 624-627
Number of pages: 4