A companding front end for noise-robust automatic speech recognition

被引:0
|
作者
Guinness, J [1 ]
Raj, B [1 ]
Schmidt-Nielsen, B [1 ]
Turicchia, L [1 ]
Sarpeshkar, R [1 ]
机构
[1] Mitsubishi Elect Res Labs, Cambridge, MA USA
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Feature computation modules for automatic speech recognition (ASR) systems have long been modeled on the human auditory system. Most current ASR systems model the critical band response and equal loudness characteristics of the auditory system. It has been postulated that more detailed models of the human auditory system can lead to more noise-robust speech recognition. An auditory phenomenon that is of particular relevance to robustness is simultaneous masking, whereby dominant frequencies suppress adjacent weaker frequencies. In this paper we present a companding-based model that mimics simultaneous masking in the front end of a speech recognizer. In an automotive digits recognition task, the front end improves word error rate by 4.0% (25% relative to Mel cepstra) at -5 dB SNR at the cost of a 1.7% increase at 15 dB SNR.
引用
收藏
页码:249 / 252
页数:4
相关论文
共 50 条
  • [1] An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition
    Bhiksha Raj
    Lorenzo Turicchia
    Bent Schmidt-Nielsen
    Rahul Sarpeshkar
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2007
  • [2] An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition
    Raj, Bhiksha
    Turicchia, Lorenzo
    Schmidt-Nielsen, Bent
    Sarpeshkar, Rahul
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2007, 2007 (1)
  • [3] An Overview of Noise-Robust Automatic Speech Recognition
    Li, Jinyu
    Deng, Li
    Gong, Yifan
    Haeb-Umbach, Reinhold
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (04) : 745 - 777
  • [4] A noise-robust front-end for distributed speech recognition in mobile communications
    Addou, Djamel
    Selouani, Sid-Ahmed
    Kifaya, Kaoukeb
    Boudraa, Malika
    Boudraa, Bachir
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2007, 10 (04) : 167 - 173
  • [5] Efficient Noise-Robust Speech Recognition Front-End Based on the ETSI Standard
    Neves, Claudio
    Veiga, Arlindo
    Sa, Luis
    Perdigao, Fernando
    [J]. ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 609 - 612
  • [6] Factorial Speech Processing Models for Noise-Robust Automatic Speech Recognition
    Khademian, Mahdi
    Homayounpour, Mohammad Mehdi
    [J]. 2015 23RD IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2015, : 637 - 642
  • [7] INCORPORATING MASK MODELLING FOR NOISE-ROBUST AUTOMATIC SPEECH RECOGNITION
    Koekueer, Muenevver
    Jancovic, Peter
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS, 2009, : 3929 - 3932
  • [8] Empirical Mode Decomposition For Noise-Robust Automatic Speech Recognition
    Wu, Kuo-Hao
    Chen, Chia-Ping
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2074 - 2077
  • [9] Noise-robust Attention Learning for End-to-End Speech Recognition
    Higuchi, Yosuke
    Tawara, Naohiro
    Ogawa, Atsunori
    Iwata, Tomoharu
    Kobayashi, Tetsunori
    Ogawa, Tetsuji
    [J]. 28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 311 - 315
  • [10] Noise-Robust Algorithm of Speech Features Extraction for Automatic Speech Recognition System
    Yakhnev, A. N.
    Pisarev, A. S.
    [J]. PROCEEDINGS OF THE XIX IEEE INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND MEASUREMENTS (SCM 2016), 2016, : 206 - 208