Combined Waveform-Cepstral Representation for Robust Speech Recognition

Cited by: 0
Authors
Ager, Matthew [1]
Cvetkovic, Zoran [2]
Sollich, Peter [1]
Affiliations
[1] Kings Coll London, Dept Math, London, England
[2] Kings Coll London, Dept Inform, London, England
Funding
Engineering and Physical Sciences Research Council (UK);
Keywords
Speech recognition; robustness; acoustic waveforms; hybrid representation; word recognition
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
High-dimensional acoustic waveform representations are studied as a front-end for noise-robust automatic speech recognition using generative methods, in particular Gaussian mixture models and hidden Markov models. The proposed representations are compared with standard cepstral features on phoneme classification and recognition tasks. While cepstral features achieve lower error rates at very low noise levels, the acoustic waveform representations are much more robust to noise. A convex combination of acoustic waveforms and cepstral features is then considered, and it achieves higher accuracy than either individual representation across all noise levels.
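The abstract does not say which quantities enter the convex combination. As a minimal sketch, assume it is applied to per-class log-likelihood scores produced by two generative classifiers, one per representation. The function names, the weight `alpha`, and the scores below are hypothetical and serve only to illustrate the idea; they are not taken from the paper.

```python
def combine_scores(waveform_loglik, cepstral_loglik, alpha):
    """Convex combination of per-class scores from two representations.

    alpha = 1.0 uses only the waveform scores, alpha = 0.0 only the
    cepstral scores (an assumed convention, not the paper's notation).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(waveform_loglik) != len(cepstral_loglik):
        raise ValueError("both score lists must cover the same classes")
    return [alpha * w + (1.0 - alpha) * c
            for w, c in zip(waveform_loglik, cepstral_loglik)]


def classify(waveform_loglik, cepstral_loglik, alpha):
    """Return the index of the class with the highest combined score."""
    scores = combine_scores(waveform_loglik, cepstral_loglik, alpha)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical log-likelihoods for three phoneme classes.
waveform = [-10.0, -12.0, -15.0]
cepstral = [-20.0, -11.0, -30.0]

for alpha in (1.0, 0.5, 0.0):
    print(alpha, classify(waveform, cepstral, alpha))
```

In a setup like this, `alpha` would typically be tuned per noise level on held-out data, which is one plausible way a combination could outperform both individual representations.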
Pages: 864-868
Page count: 5
Related papers
50 in total
  • [1] New cepstral representation using wavelet analysis and spectral transformation for robust speech recognition
    Wassner, H
    Chollet, G
    [J]. ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4, 1996, : 260 - 263
  • [2] Cepstral gain normalization for noise robust speech recognition
    Yoshizawa, Shingo
    Hayasaka, Noboru
    Wada, Naoya
    Miyanaga, Yoshikazu
    [J]. ICASSP IEEE Int Conf Acoust Speech Signal Process Proc, 1600, (I209-I212):
  • [3] Cepstral shape normalization (CSN) for robust speech recognition
    Du, Jun
    Wang, Ren-Hua
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4389 - 4392
  • [4] Gammatone Wavelet Cepstral Coefficients for Robust Speech Recognition
    Adiga, Aniruddha
    Magimai-Doss, Mathew
    Seelamantula, Chandra Sekhar
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE OF IEEE REGION 10 (TENCON), 2013,
  • [5] Damped Oscillator Cepstral Coefficients for Robust Speech Recognition
    Mitra, Vikramjit
    Franco, Horacio
    Graciarena, Martin
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 886 - 890
  • [6] CEPSTRAL NOISE SUBTRACTION FOR ROBUST AUTOMATIC SPEECH RECOGNITION
    Rehr, Robert
    Gerkmann, Timo
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 375 - 378
  • [7] PARAMETRIC CEPSTRAL MEAN NORMALIZATION FOR ROBUST SPEECH RECOGNITION
    Kalinli, Ozlem
    Bhattacharya, Gautam
    Weng, Chao
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6735 - 6739
  • [8] Robust Speech Recognition Combining Cepstral and Articulatory Features
    Zha, Zhuan-ling
    Hu, Jin
    Zhan, Qing-ran
    Shan, Ya-hui
    Xie, Xiang
    Wang, Jing
    Cheng, Hao-bo
    [J]. PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 1401 - 1405
  • [9] Cepstral gain normalization for noise robust speech recognition
    Yoshizawa, S
    Hayasaka, N
    Wada, N
    Miyanaga, Y
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 209 - 212
  • [10] A combined cepstral distance method for emotional speech recognition
    Quan, Changqin
    Zhang, Bin
    Sun, Xiao
    Ren, Fuji
    [J]. INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2017, 14 (04): : 1 - 9