Teager Energy Cepstral Coefficients for Classification of Normal vs. Whisper Speech

Cited by: 0
Authors
Khoria, Kuldeep [1]
Kamble, Madhu R. [1]
Patil, Hemant A. [1]
Affiliations
[1] Dhirubhai Ambani Inst Informat & Commun Technol, Gandhinagar, Gujarat, India
Keywords
Whispered Speech Recognition (WSR); Teager Energy Operator; Equal Error Rate (EER); Latency; VOICE CONVERSION; RECOGNITION; RECONSTRUCTION;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Whispered speech differs markedly from natural speech in its nature, acoustic characteristics, and generation mechanism. To improve the robustness of Automatic Speech Recognition (ASR) systems, it is important to analyze mismatched training and testing conditions and to propose robust acoustic features that enhance whisper recognition. In this paper, we propose Teager Energy Cepstral Coefficients (TECC), which use the Teager Energy Operator (TEO) to estimate the "true" total energy of the signal, i.e., the sum of its kinetic and potential energies. This contrasts with the traditional signal energy approximation, i.e., the L2 norm of the signal, which takes only kinetic energy into account. Experiments are performed on the wTIMIT and CHAINS corpora. A frame-level accuracy of 92.22 % is obtained for the wTIMIT corpus and 95.61 % for the CHAINS corpus. We also evaluate the classifier using the Matthews Correlation Coefficient (MCC), F-measure, and J-statistic. Furthermore, experiments are performed by considering the latency period from a practical deployment viewpoint, and the trade-off between latency period and accuracy is discussed for both corpora.
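As an illustration of the energy contrast drawn in the abstract, the following is a minimal sketch of the standard discrete-time Teager Energy Operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], compared with a mean-squared (L2-based) energy estimate. It is not the authors' TECC pipeline: the abstract does not describe the filterbank, framing, or cepstral steps, so none of them are reproduced here. The function names and the test-tone parameters are assumptions chosen only for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns one value per interior sample, so the output is two samples
    shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def mean_squared_energy(x):
    """Conventional estimate: squared L2 norm divided by the signal length."""
    return sum(v * v for v in x) / len(x)


def tone(amplitude, omega, n_samples):
    """Hypothetical test signal: A * cos(omega * n), omega in radians/sample."""
    return [amplitude * math.cos(omega * n) for n in range(n_samples)]


# Two tones with equal amplitude but different frequencies (illustrative values).
low = tone(1.0, 0.1, 2000)
high = tone(1.0, 0.4, 2000)

psi_low = teager_energy(low)
psi_high = teager_energy(high)

# For a pure tone the operator is constant and equals A^2 * sin^2(omega),
# so it grows with frequency, while the mean-squared energy stays near A^2 / 2.
print("TEO low :", psi_low[0], "expected", math.sin(0.1) ** 2)
print("TEO high:", psi_high[0], "expected", math.sin(0.4) ** 2)
print("L2 low  :", mean_squared_energy(low))
print("L2 high :", mean_squared_energy(high))
```

For a pure tone the operator output depends on both amplitude and frequency, whereas the mean-squared energy depends on amplitude alone. That is the sense in which the two estimates differ in this sketch.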
Pages: 371-375
Page count: 5
Related papers
50 in total
  • [21] Rayleigh modeling of teager energy operated perceptual wavelet packet coefficients for enhancing noisy speech
    Islam, Md Tauhidul
    Shahnaz, Celia
    Zhu, Wei-Ping
    Ahmad, M. Omair
    SPEECH COMMUNICATION, 2017, 86 : 64 - 74
  • [22] Automatic speech recognition based on cepstral coefficients and a Mel-based discrete energy operator
    Tolba, H
    O'Shaughnessy, D
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 973 - 976
  • [23] ON AUTOMATIC VOICE CASTING FOR EXPRESSIVE SPEECH: SPEAKER RECOGNITION VS. SPEECH CLASSIFICATION
    Obin, Nicolas
    Roebel, Axel
    Bachman, Gregoire
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [24] Research on Speech Emotion Recognition Based on Teager Energy Operator Coefficients and Inverted MFCC Feature Fusion
    Wang, Feifan
    Shen, Xizhong
    ELECTRONICS, 2023, 12 (17)
  • [25] A semisoft thresholding method based on Teager energy operation on wavelet packet coefficients for enhancing noisy speech
    Sanam, Tahsina Farah
    Shahnaz, Celia
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2013,
  • [26] A semisoft thresholding method based on Teager energy operation on wavelet packet coefficients for enhancing noisy speech
    Tahsina Farah Sanam
    Celia Shahnaz
    EURASIP Journal on Audio, Speech, and Music Processing, 2013
  • [27] Neurodynamical route to chaos and normal speech vs. stuttering
    Skljarov, OP
    CONTROL OF OSCILLATIONS AND CHAOS, VOLS 1-3, PROCEEDINGS, 2000, : 449 - 452
  • [28] Classification of speech under stress based on features derived from the nonlinear Teager Energy Operator
    Zhou, GJ
    Hansen, JHL
    Kaiser, JF
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 549 - 552
  • [29] A JOINT EMD AND TEAGER-KAISER ENERGY APPROACH TOWARDS NORMAL AND NASAL SPEECH ANALYSIS
    De La Cruz, Chris
    Santhanam, Balu
    2016 50TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2016, : 429 - 433
  • [30] Algorithm for speech emotion recognition classification based on Mel-frequency Cepstral coefficients and broad learning system
    Yang, Zhiyou
    Huang, Ying
    EVOLUTIONARY INTELLIGENCE, 2022, 15 (04) : 2485 - 2494