Teager Energy Cepstral Coefficients for Classification of Normal vs. Whisper Speech

Cited by: 0
Authors
Khoria, Kuldeep [1]
Kamble, Madhu R. [1]
Patil, Hemant A. [1]
Affiliations
[1] Dhirubhai Ambani Inst Informat & Commun Technol, Gandhinagar, Gujarat, India
Keywords
Whispered Speech Recognition (WSR); Teager Energy Operator; Equal Error Rate (EER); Latency; Voice Conversion; Recognition; Reconstruction
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Whispered speech differs markedly from natural speech in its nature, acoustic characteristics, and generation mechanism. To improve the robustness of Automatic Speech Recognition (ASR) systems, it is important to analyze mismatched training and testing conditions and to propose robust acoustic features that enhance whisper recognition. In this paper, we propose to use Teager Energy Cepstral Coefficients (TECC), which use the Teager Energy Operator (TEO) to estimate the "true" total energy of the signal, i.e., the sum of kinetic and potential energies. This is in contrast to the traditional signal energy approximation, i.e., the L2 norm of the signal, which takes only kinetic energy into account. Experiments are performed on the wTIMIT and CHAINS corpora. A frame-level accuracy of 92.22% is obtained for the wTIMIT corpus and 95.61% for the CHAINS corpus. We also evaluate the classifier using the Matthews Correlation Coefficient (MCC), F-measure, and J-statistic. Furthermore, experiments are performed with the latency period in mind from a practical deployment viewpoint, and the trade-off between latency period and accuracy is discussed for both corpora.
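The abstract names the Teager Energy Operator but does not write it out. As a minimal sketch, assuming the standard discrete-time form Ψ[x(n)] = x(n)^2 - x(n-1)·x(n+1), the per-sample energy profile of a signal could be computed as below. The function name `teager_energy` and the test signal are illustrative assumptions; this is not the authors' TECC pipeline, whose remaining stages are not described in this record.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Hypothetical helper for illustration. The output is two samples
    shorter than the input because the first and last samples lack
    a neighbour on one side.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 1 or x.size < 3:
        raise ValueError("expected a 1-D signal with at least 3 samples")
    return x[1:-1] ** 2 - x[:-2] * x[2:]


# Illustrative input (an assumption, not data from the paper):
# a sinusoid with amplitude 2 and digital frequency 0.3 rad/sample.
n = np.arange(50)
signal = 2.0 * np.cos(0.3 * n + 0.7)
psi = teager_energy(signal)
print(psi.min(), psi.max())
```

For a pure sinusoid A·cos(Ωn + φ), this operator returns the constant A²·sin²(Ω), so the estimate depends on both amplitude and frequency, whereas the squared-sample energy depends on amplitude alone.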
Pages: 371-375
Page count: 5
Related papers
50 in total
  • [1] Teager Energy Cepstral Coefficients For Classification of Dysarthric Speech Severity-Level
    Kachhi, Aastha
    Therattil, Anand
    Patil, Ankur T.
    Sailor, Hardik B.
    Patil, Hemant A.
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1462 - 1468
  • [2] Combining Evidences from Variable Teager Energy Source and Mel Cepstral Features for Classification of Normal vs. Pathological Voices
    Patil, Hemant A.
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [3] CONSTANT Q CEPSTRAL COEFFICIENTS FOR CLASSIFICATION OF NORMAL VS. PATHOLOGICAL INFANT CRY
    Patil, Hemant A.
    Patil, Ankur T.
    Kachhi, Aastha
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7392 - 7396
  • [4] Exploiting Phase-based Features for Whisper vs. Speech Classification
    Shah, Nirmesh J.
    Shaik, M. Ali Basha
    Periyasamy, P.
    Patil, Hemant A.
    Vij, Vikram
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 21 - 25
  • [5] Data-driven Rescaled Teager Energy Cepstral Coefficients for Noise-robust Speech Recognition
    Hsu, Miau-Luan
    Chen, Chia-Ping
    2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [6] NOVEL ENHANCED TEAGER ENERGY BASED CEPSTRAL COEFFICIENTS FOR REPLAY SPOOF DETECTION
    Acharya, Rajul
    Patil, Hemant A.
    Kotta, Harsh
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 342 - 349
  • [7] CQT-Based Cepstral Features for Classification of Normal vs. Pathological Infant Cry
    Patil, Hemant A.
    Kachhi, Aastha
    Patil, Ankur T.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 4713 - 4726
  • [8] The Teager-Kaiser Energy Cepstral Coefficients as an Effective Structural Health Monitoring Tool
    Civera, Marco
    Ferraris, Matteo
    Ceravolo, Rosario
    Surace, Cecilia
    Betti, Raimondo
    APPLIED SCIENCES-BASEL, 2019, 9 (23):
  • [9] Improving the potential of Enhanced Teager Energy Cepstral Coefficients (ETECC) for replay attack detection
    Patil, Ankur T.
    Acharya, Rajul
    Patil, Hemant A.
    Guido, Rodrigo Capobianco
    COMPUTER SPEECH AND LANGUAGE, 2022, 72
  • [10] CROSS-TEAGER ENERGY CEPSTRAL COEFFICIENTS FOR REPLAY SPOOF DETECTION ON VOICE ASSISTANTS
    Acharya, Rajul
    Kotta, Harsh
    Patil, Ankur T.
    Patil, Hemant A.
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6364 - 6368