Teager Energy Cepstral Coefficients for Classification of Normal vs. Whisper Speech

Cited by: 0
Authors
Khoria, Kuldeep [1]
Kamble, Madhu R. [1]
Patil, Hemant A. [1]
Affiliations
[1] Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, Gujarat, India
Keywords
Whispered Speech Recognition (WSR); Teager Energy Operator; Equal Error Rate (EER); Latency; Voice Conversion; Recognition; Reconstruction
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Whispered speech differs considerably from natural speech in its nature, acoustic characteristics, and generation mechanism. To improve the robustness of Automatic Speech Recognition (ASR) systems, it is important to analyze mismatched training and testing conditions and to propose robust acoustic features that enhance whisper recognition. In this paper, we propose to use Teager Energy Cepstral Coefficients (TECC), which use the Teager Energy Operator (TEO) to estimate the "true" total energy of the signal, i.e., the sum of kinetic and potential energies. This contrasts with the traditional signal energy approximation, i.e., the L2 norm of the signal, which takes only kinetic energy into account. Experiments are performed on the wTIMIT and CHAINS corpora. A frame-level accuracy of 92.22 % is obtained for the wTIMIT corpus and 95.61 % for the CHAINS corpus. We also evaluate classifier performance using the Matthews Correlation Coefficient (MCC), F-measure, and J-statistic. Furthermore, experiments are performed considering the latency period from a practical deployment viewpoint, and the trade-off between latency period and accuracy is discussed for both corpora.
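The energy contrast drawn in the abstract can be illustrated with the standard discrete-time form of the Teager Energy Operator, Psi[x(n)] = x(n)^2 - x(n-1) x(n+1). The following is a minimal sketch of the operator alone, not the authors' TECC extraction pipeline (the abstract does not describe the filterbank or cepstral stages). The function names `teager_energy` and `squared_energy` and the test-tone parameters are illustrative assumptions.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, because the first and last samples
    lack one of the two neighbours the operator needs.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 3:
        raise ValueError("need at least 3 samples")
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def squared_energy(x):
    """Conventional instantaneous energy (squared amplitude), for comparison."""
    x = np.asarray(x, dtype=float)
    return x ** 2


# Illustrative test tone (assumed values, not taken from the paper).
amplitude, omega = 2.0, 0.3          # omega is in radians per sample
n = np.arange(200)
tone = amplitude * np.cos(omega * n)

psi = teager_energy(tone)            # constant, tracks amplitude AND frequency
sq = squared_energy(tone)            # oscillates, depends on amplitude only
```

For a pure tone A cos(omega n), the operator output is the constant A^2 sin^2(omega), so it grows with both amplitude and frequency, whereas the squared amplitude carries no frequency dependence. This is the sense in which the TEO is said to capture more than the conventional energy measure.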
Pages: 371-375
Number of pages: 5