Local Feature or Mel Frequency Cepstral Coefficients - Which One Is Better for MLN-Based Bangla Speech Recognition?

Cited by: 0
Authors
Hassan, Foyzul [1]
Kotwal, Mohammed Rokibul Alam [1]
Rahman, Md Mostafizur [1]
Nasiruddin, Mohammad [2]
Latif, Md Abdul [2]
Huda, Mohammad Nurul [1]
Affiliations
[1] United Int Univ, Dhaka, Bangladesh
[2] Univ Asia Pacific, Dhaka, Bangladesh
Keywords
Local Feature; Mel Frequency Cepstral Coefficient; Multilayer Neural Network; Hidden Markov Model; Automatic Speech Recognition
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper argues that local features (LFs), extracted from Bangla input speech and used as input to a multilayer neural network (MLN), outperform mel frequency cepstral coefficients (MFCCs). The LF-based method comprises three stages: (i) LF extraction from the input speech, (ii) extraction of phoneme probabilities from the LFs using an MLN, and (iii) a hidden Markov model (HMM) based classifier that produces more accurate phoneme strings. In experiments on a Bangla speech corpus that we prepared, the LF-based automatic speech recognition (ASR) system achieves a higher phoneme correct rate than the MFCC-based system. Moreover, the proposed system requires fewer mixture components in the HMMs.
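The three-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the LF extractor, the MLN topology, or the HMM configuration, so every function name, parameter, and design choice below is a hypothetical stand-in. Stage (i) uses frame log-energy and its delta as a placeholder feature, stage (ii) is a single-hidden-layer network whose weights are passed in by the caller, and stage (iii) replaces the mixture-based HMM classifier with plain Viterbi decoding over the frame posteriors.

```python
import numpy as np


def extract_features(signal, frame_len=400, hop=160):
    """Stage (i) placeholder: per-frame log energy and its delta.

    Assumption: this is NOT the paper's local-feature extractor; it only
    produces one feature vector per frame so the pipeline can run.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    log_energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    delta = np.gradient(log_energy)
    return np.column_stack([log_energy, delta])


def mln_posteriors(features, w1, b1, w2, b2):
    """Stage (ii): one hidden layer plus softmax, giving per-frame
    phoneme probabilities. The weights are hypothetical inputs."""
    hidden = np.tanh(features @ w1 + b1)
    logits = hidden @ w2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def viterbi_decode(posteriors, log_trans):
    """Stage (iii) simplified: best state path given frame posteriors and
    log transition scores log_trans[i, j] for moving from state i to j."""
    log_post = np.log(posteriors + 1e-12)
    n_frames, n_states = log_post.shape
    score = log_post[0].copy()
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        cand = score[:, None] + log_trans  # cand[i, j]: arrive in j from i
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_post[t]
    path = [int(score.argmax())]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

A toy run would pass a one-second signal at an assumed 16 kHz sampling rate through `extract_features`, feed the result to `mln_posteriors` with randomly initialized weights, and decode with `viterbi_decode`. Comparing LF and MFCC inputs, as the paper does, would amount to swapping the stage (i) function while keeping stages (ii) and (iii) fixed.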
Pages: 154 / +
Page count: 2