Perceptual MVDR-based cepstral coefficients (PMCCs) for robust speech recognition

Citations: 0
Authors
Yapanel, UH [1]
Dharanipragada, S [1]
Affiliation
[1] Univ Colorado, Ctr Spoken Language Res, Boulder, CO 80309 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
This paper describes a robust feature extraction technique for continuous speech recognition. Central to the technique is the Minimum Variance Distortionless Response (MVDR) method of spectrum estimation. We incorporate perceptual information directly into the spectrum estimation. This provides improved robustness and computational efficiency compared with the previously proposed MVDR-MFCC technique [10]. On an in-car speech recognition task, this method, which we refer to as PMCC, is 15% more accurate in WER and requires approximately 4 times less computation than the MVDR-MFCC technique. On the same task, PMCC yields a 20% relative improvement over MFCC and an 11% relative improvement over PLP front ends. Similar improvements are observed on the Aurora 2 database.
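The abstract names MVDR spectrum estimation as the core of the technique. As a rough illustration only, the sketch below computes the textbook MVDR (Capon) spectrum, P(w) = 1 / (v(w)^H R^{-1} v(w)), for a single frame on a linear frequency grid. It is not the authors' PMCC front end: the perceptual warping described in the abstract is omitted, and the function name `mvdr_spectrum`, the model order, the frequency grid size, and the diagonal loading term are all assumptions made for this example.

```python
import numpy as np

def mvdr_spectrum(frame, order=12, n_freqs=129, ridge=1e-6):
    """Textbook MVDR (Capon) power spectrum of one frame (illustrative sketch)."""
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0..order].
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    # Toeplitz autocorrelation matrix R[i, j] = r[|i - j|].
    idx = np.arange(order + 1)
    R = r[np.abs(idx[:, None] - idx[None, :])]
    # Small diagonal loading (an assumption) keeps R well conditioned.
    R = R + ridge * r[0] * np.eye(order + 1)
    # Linear frequency grid from 0 to pi (no perceptual warping here).
    omegas = np.linspace(0.0, np.pi, n_freqs)
    # Steering vectors v(w) = [1, e^{jw}, ..., e^{jMw}]^T, one per column.
    V = np.exp(1j * np.outer(idx, omegas))
    # Quadratic form v^H R^{-1} v for every frequency at once.
    quad = np.einsum("kf,kf->f", V.conj(), np.linalg.solve(R, V)).real
    return omegas, 1.0 / quad

# Usage: a noisy sinusoid at 0.25*pi rad/sample.
t = np.arange(400)
rng = np.random.default_rng(0)
signal = np.sin(0.25 * np.pi * t) + 0.05 * rng.standard_normal(400)
omegas, power = mvdr_spectrum(signal)
peak_omega = omegas[np.argmax(power)]
```

Solving against the full matrix is the simplest way to write the definition; practical front ends typically obtain the same spectrum more cheaply from linear prediction coefficients.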
Pages: 644-647
Page count: 4
Related papers
50 in total
  • [31] Chip design of mel frequency cepstral coefficients for speech recognition
    Wang, JC
    Wang, JF
    Weng, YS
    [J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 3658 - 3661
  • [32] Recognition of emotion from speech using evolutionary cepstral coefficients
    Bakhshi, Ali
    Chalup, Stephan
    Harimi, Ali
    Mirhassani, Seyed Mostafa
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (47-48) : 35739 - 35759
  • [33] Recognition of emotion from speech using evolutionary cepstral coefficients
    Ali Bakhshi
    Stephan Chalup
    Ali Harimi
    Seyed Mostafa Mirhassani
    [J]. Multimedia Tools and Applications, 2020, 79 : 35739 - 35759
  • [34] Data-driven Rescaled Teager Energy Cepstral Coefficients for Noise-robust Speech Recognition
    Hsu, Miau-Luan
    Chen, Chia-Ping
    [J]. 2012 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2012,
  • [35] Automatic speech recognition based on cepstral coefficients and a Mel-based discrete energy operator
    Tolba, H
    O'Shaughnessy, D
    [J]. PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 973 - 976
  • [36] Regularized MVDR Spectrum Estimation-based Robust Feature Extractors for Speech Recognition
    Alam, Md Jahangir
    Kenny, Patrick
    O'Shaughnessy, Douglas
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 891 - 895
  • [37] Speech emotion recognition based on deep belief networks and wavelet packet cepstral coefficients
    [J]. 1600, UK Simulation Society, Clifton Lane, Nottingham, NG11 8NS, United Kingdom (17):
  • [38] Robust Underwater Target Recognition Using Auditory Cepstral Coefficients
    Wu, Yaozhen
    Yang, Yixin
    Tao, Can
    Tian, Feng
    Yang, Long
    [J]. OCEANS 2014 - TAIPEI, 2014,
  • [39] Bounded cepstral marginalization of missing data for robust speech recognition
    Kafoori, Kian Ebrahim
    Ahadi, Seyed Mohammad
    [J]. COMPUTER SPEECH AND LANGUAGE, 2016, 36 : 1 - 23
  • [40] Cepstral amplitude range normalization for noise robust speech recognition
    Yoshizawa, S
    Hayasaka, N
    Wada, N
    Miyanaga, Y
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (08): : 2130 - 2137