Local Feature or Mel Frequency Cepstral Coefficients - Which One Is Better for MLN-Based Bangla Speech Recognition?

Cited by: 0
Authors
Hassan, Foyzul [1]
Kotwal, Mohammed Rokibul Alam [1]
Rahman, Md Mostafizur [1]
Nasiruddin, Mohammad [2]
Latif, Md Abdul [2]
Huda, Mohammad Nurul [1]
Affiliations
[1] United Int Univ, Dhaka, Bangladesh
[2] Univ Asia Pacific, Dhaka, Bangladesh
Keywords
Local Feature; Mel Frequency Cepstral Coefficient; Multilayer Neural Network; Hidden Markov Model; Automatic Speech Recognition
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper discusses the superiority of local features (LFs) over mel frequency cepstral coefficients (MFCCs) as input to a multilayer neural network (MLN) for Bangla speech. The LF-based method comprises three stages: (i) LF extraction from the input speech, (ii) phoneme probability extraction from the LFs using an MLN, and (iii) a hidden Markov model (HMM) based classifier that produces more accurate phoneme strings. In experiments on a Bangla speech corpus prepared by us, the LF-based automatic speech recognition (ASR) system provides a higher phoneme correct rate than the MFCC-based system. Moreover, the proposed system requires fewer mixture components in the HMMs.
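As a rough illustration of why stage (iii) can yield cleaner phoneme strings than the raw MLN output, the sketch below runs Viterbi decoding over per-frame phoneme probabilities. It is a simplification under stated assumptions, not the paper's classifier: the posteriors are used directly as emission scores (the abstract describes HMMs with mixture components), and the two phoneme labels, the transition probabilities, and the frame values are all hypothetical.

```python
import math


def viterbi(log_post, log_trans, log_init):
    """Return the most likely state path given per-frame log scores."""
    n_states = len(log_init)
    score = [log_init[s] + log_post[0][s] for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for frame in log_post[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        back.append(pointers)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def collapse(path, labels):
    """Merge runs of identical states into a phoneme string."""
    out = []
    for s in path:
        if not out or out[-1] != labels[s]:
            out.append(labels[s])
    return out


def to_log(rows):
    return [[math.log(v) for v in row] for row in rows]


# Hypothetical MLN output for five frames and two phonemes; frame 1 is noisy.
posteriors = [[0.8, 0.2], [0.4, 0.6], [0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]
transitions = [[0.9, 0.1], [0.1, 0.9]]  # assumed: states tend to persist
labels = ["a", "k"]  # hypothetical phoneme labels

framewise = [max(range(2), key=lambda s: row[s]) for row in posteriors]
decoded = viterbi(to_log(posteriors), to_log(transitions),
                  [math.log(0.5), math.log(0.5)])

print(collapse(framewise, labels))  # ['a', 'k', 'a', 'k']
print(collapse(decoded, labels))    # ['a', 'k']
```

Taking the per-frame maximum follows the noisy frame and inserts a spurious phoneme, while the transition model smooths it away.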
Pages: 154 / +
Page count: 2