POWER-NORMALIZED CEPSTRAL COEFFICIENTS (PNCC) FOR ROBUST SPEECH RECOGNITION

Cited by: 0
Authors
Kim, Chanwoo [1]
Stern, Richard M. [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
Robust speech recognition; feature extraction; physiological modeling; rate-level curve; asymmetric filtering; medium-time power estimation; temporal masking; modulation filtering; on-line speech processing;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper presents a new feature extraction algorithm called Power Normalized Cepstral Coefficients (PNCC) that is based on auditory processing. Major new features of PNCC processing include the use of a power-law nonlinearity that replaces the traditional log nonlinearity used in MFCC coefficients, a noise-suppression algorithm based on asymmetric filtering that suppresses background excitation, and a module that accomplishes temporal masking. We also propose the use of medium-time power analysis, in which environmental parameters are estimated over a longer duration than is commonly used for speech, as well as frequency smoothing. Experimental results demonstrate that PNCC processing provides substantial improvements in recognition accuracy compared to MFCC and PLP processing for speech in the presence of various types of additive noise and in reverberant environments, with only slightly greater computational cost than conventional MFCC processing, and without degrading the recognition accuracy that is observed while training and testing using clean speech. PNCC processing also provides better recognition accuracy in noisy environments than techniques such as Vector Taylor Series (VTS) and the ETSI Advanced Front End (AFE) while requiring much less computation. We describe an implementation of PNCC using "on-line processing" that does not require future knowledge of the input.
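The three ingredients named in the abstract (power-law compression in place of the log, medium-time power averaging, and asymmetric filtering to track background excitation) can be illustrated with a minimal sketch. The abstract does not give parameter values, so the exponent 1/15, the half-width of 2 frames, and the smoothing constants 0.999 and 0.5 used below are assumptions chosen for illustration, and the function names are hypothetical. This operates on a single channel of per-frame power values and is not the authors' implementation.

```python
import math


def power_law(power, exponent=1.0 / 15.0):
    """Power-law compression. The exponent is an assumed value."""
    return power ** exponent


def log_compress(power, floor=1e-12):
    """Log compression as used in MFCC-style front ends (floored to avoid log(0))."""
    return math.log(max(power, floor))


def medium_time_power(frames, half_width=2):
    """Average each frame's power with its neighbors within half_width frames.

    Windows are truncated at the edges of the sequence.
    """
    n = len(frames)
    out = []
    for m in range(n):
        lo = max(0, m - half_width)
        hi = min(n, m + half_width + 1)
        window = frames[lo:hi]
        out.append(sum(window) / len(window))
    return out


def asymmetric_lowpass(q_in, lam_rise=0.999, lam_fall=0.5):
    """First-order filter that rises slowly and falls quickly.

    With lam_rise close to 1 the output follows the slowly varying
    lower envelope of q_in, which serves as a background-level estimate.
    Both constants are assumed values.
    """
    out = []
    prev = q_in[0]
    for q in q_in:
        lam = lam_rise if q >= prev else lam_fall
        prev = lam * prev + (1.0 - lam) * q
        out.append(prev)
    return out


# One channel: steady background of 1.0 with a short burst of 10.0.
channel = [1.0] * 5 + [10.0] * 3 + [1.0] * 5
smoothed = medium_time_power(channel)
floor_estimate = asymmetric_lowpass(channel)
excess = [max(q - f, 0.0) for q, f in zip(channel, floor_estimate)]
features = [power_law(e) for e in excess]
```

In this toy input the floor estimate barely moves during the burst, so subtracting it removes the steady background while leaving the burst largely intact. Unlike the log, the power law maps zero power to zero rather than to a large negative value, which is one reason it behaves more gracefully on near-silent frames.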
Pages: 4101-4104
Number of pages: 4
Related papers
50 in total
  • [1] Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition
    Kim, Chanwoo
    Stern, Richard M.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (07) : 1315 - 1329
  • [2] Power-Normalized Cepstral Coefficients (PNCC) for Punjabi Automatic Speech Recognition using Phone based Modelling in HTK
    Kaur, Arshpreet
    Singh, Amitoj
    [J]. PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON APPLIED AND THEORETICAL COMPUTING AND COMMUNICATION TECHNOLOGY (ICATCCT), 2016, : 372 - 375
  • [3] Enhanced Automatic Speech Recognition System Based on Enhancing Power-Normalized Cepstral Coefficients
    Tamazin, Mohamed
    Gouda, Ahmed
    Khedr, Mohamed
    [J]. APPLIED SCIENCES-BASEL, 2019, 9 (10):
  • [4] Robust Feature for Underwater Targets Recognition Using Power-Normalized Cepstral Coefficients
    Zhang, Yifan
    Xu, Ke
    Wan, Jianwei
    [J]. PROCEEDINGS OF 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2018, : 90 - 93
  • [5] POWER-NORMALIZED PLP (PNPLP) FEATURE FOR ROBUST SPEECH RECOGNITION
    Fan, Lichun
    Ke, Dengfeng
    Fu, Xiaoyin
    Lu, Shixiang
    Xu, Bo
    [J]. 2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, 2012, : 224 - 228
  • [6] Active Sonar Target Classification with Power-Normalized Cepstral Coefficients and Convolutional Neural Network
    Lee, Seungwoo
    Seo, Iksu
    Seok, Jongwon
    Kim, Yunsu
    Han, Dong Seog
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (23): 1 - 15
  • [7] Gammatone Wavelet Cepstral Coefficients for Robust Speech Recognition
    Adiga, Aniruddha
    Magimai-Doss, Mathew
    Seelamantula, Chandra Sekhar
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE OF IEEE REGION 10 (TENCON), 2013,
  • [8] Damped Oscillator Cepstral Coefficients for Robust Speech Recognition
    Mitra, Vikramjit
    Franco, Horacio
    Graciarena, Martin
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 886 - 890
  • [9] Speech Intelligibility Enhancement Algorithm Based on Multi-Resolution Power-Normalized Cepstral Coefficients (MRPNCC) for Digital Hearing Aids
    Wang, Xia
    Deng, Xing
    Shen, Hongming
    Zhang, Guodong
    Zhang, Shibing
    [J]. CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 2021, 126 (02): 693 - 710
  • [10] A Noise-Robust Feature Extraction Method for Rolling Element Bearing Diagnosis: Linear Power-Normalized Cepstral Coefficients (LPNCC)
    Keunsu Kim
    Heonjun Yoon
    Byeng D. Youn
    [J]. International Journal of Precision Engineering and Manufacturing-Green Technology, 2023, 10 : 217 - 232