Improved Phone Recognition Using Excitation Source Features

Citations: 0
Authors
Hisham, P. M. [1]
Pravena, D. [1]
Pardhu, Y. [2]
Gokul, V. [2]
Abhitej, B. [2]
Govind, D. [1]
Affiliations
[1] Amrita Vishwa Vidyapeetham Univ, Ctr Excellence Computat Engn & Networking, Coimbatore 641112, Tamil Nadu, India
[2] Amrita Vishwa Vidyapeetham, Dept Comp Sci & Engn, Kollam 690525, Kerala, India
Keywords
DOI
10.1007/978-3-319-23036-8_13
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Phone recognizers serve as the preprocessing unit for speech recognition systems and phonetic engines. Although most state-of-the-art speech recognition systems achieve relatively good accuracy at the sentence level, phone-level recognition performance falls well below sentence-level performance. The higher recognition rates at the sentence level are achieved with the help of refined language models for the language under consideration. Therefore, the objective of the present work is to improve the phoneme-level accuracy of hidden Markov model (HMM) based acoustic phone models by combining excitation source features with the conventional mel frequency cepstral coefficients (MFCC) for American English. The TIMIT and CMU Arctic databases are used for the experiments in the present work. The average spectral energy around the zero-frequency region of each frame is used as the excitation source feature and is combined with the 13 MFCC features. The effectiveness of the combined features is confirmed by a 0.5% increase in phone recognition accuracy over state-of-the-art HMM-GMM acoustic models with MFCC features.
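The feature combination described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the frame length, the hop size, or how wide the "zero-frequency region" is, so the `cutoff_hz` value and all function names here are assumptions, and the 13 MFCC vectors are assumed to have been computed elsewhere with matching framing.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a sequence of samples into overlapping frames (no padding)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def low_band_energy(frame, sample_rate, cutoff_hz=100.0):
    """Average squared DFT magnitude over bins from 0 Hz up to cutoff_hz.

    cutoff_hz is an illustrative assumption, not a value from the abstract.
    """
    n = len(frame)
    max_bin = int(cutoff_hz * n / sample_rate)  # highest bin at or below the cutoff
    energies = []
    for k in range(max_bin + 1):
        # Direct DFT of bin k; slow but dependency-free
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        energies.append(abs(coeff) ** 2)
    return sum(energies) / len(energies)


def append_source_feature(mfcc_frames, signal, sample_rate, frame_len, hop):
    """Append the low-band energy to each 13-dim MFCC vector, giving 14 dims."""
    frames = frame_signal(signal, frame_len, hop)
    if len(frames) != len(mfcc_frames):
        raise ValueError("MFCC and signal frame counts differ")
    return [list(mfcc) + [low_band_energy(f, sample_rate)]
            for mfcc, f in zip(mfcc_frames, frames)]
```

Under these assumptions, each frame yields a 14-dimensional vector (13 MFCCs plus one excitation source value) that could then be passed to an HMM-GMM trainer in place of the plain MFCC vector.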
Pages: 147-152
Number of pages: 6
Related papers
50 in total
  • [21] AN IMPROVED METHOD USING KINEMATIC FEATURES FOR ACTION RECOGNITION
    Chen, Yuanbo
    Zhao, Yanyun
    Cai, Anni
    PROCEEDINGS OF 2011 INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY AND APPLICATION, ICCTA2011, 2011, : 737 - 741
  • [22] IMPROVED SPEAKER RECOGNITION USING DCT COEFFICIENTS AS FEATURES
    McLaren, Mitchell
    Lei, Yun
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4430 - 4434
  • [23] An improved palmprint recognition system using iris features
    Laadjel, M.
    Bouridane, A.
    Nibouche, O.
    Kurugollu, F.
    Al-Maadeed, S.
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2013, 8 (03) : 253 - 263
  • [24] An improved palmprint recognition system using iris features
    M. Laadjel
    A. Bouridane
    O. Nibouche
    F. Kurugollu
    S. Al-Maadeed
    Journal of Real-Time Image Processing, 2013, 8 : 253 - 263
  • [25] Two-Stage Phone Recognition System using Articulatory and Spectral Features
    Manjunath, K. E.
    Rao, K. Sreenivasa
    Reddy, Gurunath M.
    2015 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION ENGINEERING SYSTEMS (SPACES), 2015, : 107 - 111
  • [26] Object Recognition Based on Local Features Using Camera - Equipped Mobile Phone
    Koceski, Saso
    Koceska, Natasa
    Krstev, Aleksandar
    ICT INNOVATIONS 2010, 2011, 83 : 296 - 305
  • [27] Excitation Features of Speech for Emotion Recognition Using Neutral Speech as Reference
    Kadiri, Sudarsana Reddy
    Gangamohan, P.
    Gangashetty, Suryakanth, V
    Alku, Paavo
    Yegnanarayana, B.
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2020, 39 (09) : 4459 - 4481
  • [28] Excitation Features of Speech for Emotion Recognition Using Neutral Speech as Reference
    Sudarsana Reddy Kadiri
    P. Gangamohan
    Suryakanth V. Gangashetty
    Paavo Alku
    B. Yegnanarayana
    Circuits, Systems, and Signal Processing, 2020, 39 : 4459 - 4481
  • [29] Characterization and recognition of emotions from speech using excitation source information
    Krothapalli, Sreenivasa
    Koolagudi, Shashidhar
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2013, 16 (02) : 181 - 201
  • [30] Speaker change detection in casual conversations using excitation source features
    Dhananjaya, N.
    Yegnanarayana, B.
    SPEECH COMMUNICATION, 2008, 50 (02) : 153 - 161