Perceptual MVDR-based cepstral coefficients (PMCCs) for robust speech recognition

Cited by: 0
Authors
Yapanel, UH [1]
Dharanipragada, S [1]
Affiliations
[1] Univ Colorado, Ctr Spoken Language Res, Boulder, CO 80309 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper describes a robust feature extraction technique for continuous speech recognition. Central to the technique is the Minimum Variance Distortionless Response (MVDR) method of spectrum estimation. We incorporate perceptual information directly into the spectrum estimation. This provides improved robustness and computational efficiency compared with the previously proposed MVDR-MFCC technique [10]. On an in-car speech recognition task, this method, which we refer to as PMCC, is 15% more accurate in word error rate (WER) and requires approximately a factor of 4 less computation than the MVDR-MFCC technique. On the same task, PMCC yields a 20% relative improvement over MFCC and an 11% relative improvement over PLP front ends. Similar improvements are observed on the Aurora 2 database.
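For readers unfamiliar with the MVDR spectrum estimate named in the abstract, the following is a minimal sketch of the textbook (Capon) definition, P(w) = 1 / (v(w)^H R^{-1} v(w)), where R is the Toeplitz autocorrelation matrix of a frame and v(w) is the steering vector. It is not the PMCC front end itself: the perceptual processing the abstract refers to is not shown, and the function name `mvdr_spectrum`, its parameters, the model order, and the diagonal loading constant are all illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def mvdr_spectrum(frame, order=12, n_freqs=64):
    """Textbook MVDR (Capon) power spectrum of one frame (hypothetical helper)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Biased autocorrelation estimates r[0..order].
    r = np.array([np.dot(frame[: n - k], frame[k:]) / n for k in range(order + 1)])
    # Toeplitz autocorrelation matrix R[i, j] = r[|i - j|].
    lags = np.abs(np.arange(order + 1)[:, None] - np.arange(order + 1)[None, :])
    R = r[lags]
    # Small diagonal loading (an assumption) keeps R safely invertible.
    R = R + 1e-9 * r[0] * np.eye(order + 1)
    R_inv = np.linalg.inv(R)
    # Frequencies from 0 to pi (radians per sample).
    omegas = np.linspace(0.0, np.pi, n_freqs)
    # Steering vectors v(w) = [1, e^{jw}, ..., e^{jMw}]^T, one column per frequency.
    V = np.exp(1j * np.outer(np.arange(order + 1), omegas))
    # Quadratic form v^H R^{-1} v evaluated at every frequency.
    denom = np.real(np.einsum("kf,kl,lf->f", V.conj(), R_inv, V))
    return omegas, 1.0 / denom
```

Calling `mvdr_spectrum(x)` on a windowed speech frame `x` returns the frequency grid and the corresponding spectrum values. Scaling conventions for the MVDR spectrum vary between texts, so the absolute level here should be read as one possible choice.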
Pages: 644-647
Number of pages: 4