SNR Features for Automatic Speech Recognition

Cited by: 1
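Authors
Garner, Philip N. [1]
Affiliation
[1] Idiap Res Inst, Martigny, Switzerland
Keywords
NOISE; ENHANCEMENT; SUPPRESSION
DOI
10.1109/ASRU.2009.5372895
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract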
When combined with cepstral normalisation techniques, the features normally used in Automatic Speech Recognition are based on Signal-to-Noise Ratio (SNR). We show that calculating SNR from the outset, rather than relying on cepstral normalisation to produce it, gives features with a number of practical and mathematical advantages over power-spectrum-based ones. In a detailed analysis, we derive Maximum Likelihood and Maximum a Posteriori estimates for SNR-based features, and show that they can outperform more conventional ones, especially when subsequently combined with cepstral variance normalisation. We further show anecdotal evidence that SNR-based features lend themselves well to noise estimates based on low-energy envelope tracking.
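As a rough illustration of what "calculating SNR from the outset" could look like, the sketch below turns one frame of power-spectral values and a per-bin noise estimate into SNR values. It is not taken from the paper: the estimator (observed power divided by noise power, minus one, floored at zero) is a commonly used maximum-likelihood form under a Gaussian model and is assumed here, as are the `log(1 + SNR)` compression and the function names `ml_snr` and `log_snr_features`.

```python
import math

def ml_snr(power, noise):
    """Per-bin SNR estimate (assumed form): power / noise - 1, floored at 0.

    `power` and `noise` are equal-length sequences of positive floats,
    one value per frequency bin.
    """
    return [max(p / n - 1.0, 0.0) for p, n in zip(power, noise)]

def log_snr_features(power, noise):
    """Compress the SNR estimate with log(1 + SNR).

    This compression is an illustrative assumption, not the paper's
    stated feature definition.
    """
    return [math.log1p(s) for s in ml_snr(power, noise)]

# Hypothetical frame: three frequency bins with a flat noise estimate.
frame_power = [4.0, 1.0, 8.0]
noise_power = [2.0, 2.0, 2.0]

snr = ml_snr(frame_power, noise_power)
features = log_snr_features(frame_power, noise_power)
```

In the second bin the observed power falls below the noise estimate, so the floor keeps the SNR at zero instead of letting it go negative. A real front end would apply this per frame, before any filterbank and cepstral steps.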
Pages: 182-187
Number of pages: 6