Perceptual MVDR-based cepstral coefficients (PMCCs) for robust speech recognition

被引:0
|
作者
Yapanel, UH [1 ]
Dharanipragada, S [1 ]
机构
[1] Univ Colorado, Ctr Spoken Language Res, Boulder, CO 80309 USA
关键词
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
This paper describes a robust feature extraction technique for continuous speech recognition. Central to the technique is the Minimum Variance Distortionless Response (MVDR) method of spectrum estimation. We incorporate perceptual information directly in to the spectrum estimation. This provides improved robustness and computational efficiency when compared with the previously proposed MVDR-MFCC technique [10]. On an in-car speech recognition task this method, which we refer to as PMCC is 15% more accurate in WER and requires approximately a factor of 4 times less computation than the MVDR-MFCC technique. On the same task PMCC yields 20% relative improvement over MFCC and 11% relative improvement over PLP frontends. Similar improvements are observed on the Aurora 2 database.
引用
收藏
页码:644 / 647
页数:4
相关论文
共 50 条
  • [21] Cepstral gain normalization for noise robust speech recognition
    Yoshizawa, Shingo
    Hayasaka, Noboru
    Wada, Naoya
    Miyanaga, Yoshikazu
    [J]. ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, 1600, (I209-I212):
  • [22] Improved DTW Speech Recognition Algorithm Based on the MEL Frequency Cepstral Coefficients
    Wei Ming-zhe
    Li Xi
    Ren Li-mian
    [J]. 12TH ANNUAL MEETING OF CHINA ASSOCIATION FOR SCIENCE AND TECHNOLOGY ON INFORMATION AND COMMUNICATION TECHNOLOGY AND SMART GRID, 2010, : 235 - 238
  • [23] Cepstral shape normalization (CSN) for robust speech recognition
    Du, Jun
    Wang, Ren-Hua
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4389 - 4392
  • [24] Feature Extraction Based on DCT and MVDR Spectral Estimation for Robust Speech Recognition
    Seyedin, Sanaz
    Ahadi, Mohammad
    [J]. ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 605 - 608
  • [25] PARAMETRIC CEPSTRAL MEAN NORMALIZATION FOR ROBUST SPEECH RECOGNITION
    Kalinli, Ozlem
    Bhattacharya, Gautam
    Weng, Chao
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6735 - 6739
  • [26] Sparse Kernel Cepstral Coefficients (SKCC): Inner-product based Features for Noise-Robust Speech Recognition
    Fazel, Amin
    Chakrabartty, Shantanu
    [J]. 2011 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2011, : 2401 - 2404
  • [27] CEPSTRAL NOISE SUBTRACTION FOR ROBUST AUTOMATIC SPEECH RECOGNITION
    Rehr, Robert
    Gerkmann, Timo
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 375 - 378
  • [28] Robust Speech Recognition Combining Cepstral and Articulatory Features
    Zha, Zhuan-ling
    Hu, Jin
    Zhan, Qing-ran
    Shan, Ya-hui
    Xie, Xiang
    Wang, Jing
    Cheng, Hao-bo
    [J]. PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 1401 - 1405
  • [29] Cepstral gain normalization for noise robust speech recognition
    Yoshizawa, S
    Hayasaka, N
    Wada, N
    Miyanaga, Y
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 209 - 212
  • [30] Linear Frequency Residual Cepstral Coefficients for Speech Emotion Recognition
    Hora, Baveet Singh
    Uthiraa, S.
    Patil, Hemant A.
    [J]. SPEECH AND COMPUTER, SPECOM 2023, PT I, 2023, 14338 : 116 - 129