An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition

Cited by: 4
Authors
Raj, Bhiksha [1]
Turicchia, Lorenzo [2]
Schmidt-Nielsen, Bent [1]
Sarpeshkar, Rahul [2]
Affiliations
[1] MERL, Cambridge, MA 02139 USA
[2] MIT, Cambridge, MA 02139 USA
Keywords
Error Rate; Acoustics; Recognition Task; Recognition Performance; Auditory System;
DOI
10.1155/2007/65420
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
We describe an FFT-based companding algorithm for preprocessing speech before recognition. The algorithm mimics tone-to-tone suppression and masking in the auditory system to improve automatic speech recognition performance in noise. Because it is built on the FFT, it is also computationally efficient and well suited to digital implementation. In an automotive digit recognition task on the CU-Move database, recorded in real environmental noise, the algorithm improves the relative word error rate by 12.5% at -5 dB signal-to-noise ratio (SNR) and by 6.2% across all SNRs (-5 dB to +15 dB). On the Aurora-2 database, in which noise from several environments is artificially added, the algorithm improves the relative word error rate in almost all conditions. Copyright (C) 2007 Bhiksha Raj et al.
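The abstract reports its gains as relative improvements in word error rate (WER). As a minimal sketch of how such a figure is conventionally computed, the snippet below uses the standard definition (baseline WER minus new WER, divided by baseline WER). The function name and the input values are hypothetical; this record does not give the paper's absolute error rates.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a percentage of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers, chosen only for illustration (not taken from the paper):
# a baseline WER of 40.0% that drops to 35.0% with a front end applied.
example = relative_wer_improvement(40.0, 35.0)
print(f"Relative WER improvement: {example:.1f}%")
```

Under this definition a relative improvement can look large even when the absolute change is small, so the two are best read side by side.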
Pages: 13
Related papers
50 in total
  • [1] An FFT-Based Companding Front End for Noise-Robust Automatic Speech Recognition
    Bhiksha Raj
    Lorenzo Turicchia
    Bent Schmidt-Nielsen
    Rahul Sarpeshkar
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2007
  • [2] A companding front end for noise-robust automatic speech recognition
    Guinness, J
    Raj, B
    Schmidt-Nielsen, B
    Turicchia, L
    Sarpeshkar, R
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 249 - 252
  • [3] A noise-robust FFT-based spectrum for audio classification
    Chu, Wei
    Champagne, Benoit
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 5071 - 5074
  • [4] Efficient Noise-Robust Speech Recognition Front-End Based on the ETSI Standard
    Neves, Claudio
    Veiga, Arlindo
    Sa, Luis
    Perdigao, Fernando
    [J]. ICSP: 2008 9TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, VOLS 1-5, PROCEEDINGS, 2008, : 609 - 612
  • [5] An Overview of Noise-Robust Automatic Speech Recognition
    Li, Jinyu
    Deng, Li
    Gong, Yifan
    Haeb-Umbach, Reinhold
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2014, 22 (04) : 745 - 777
  • [6] A noise-robust front-end for distributed speech recognition in mobile communications
    Addou, Djamel
    Selouani, Sid-Ahmed
    Kifaya, Kaoukeb
    Boudraa, Malika
    Boudraa, Bachir
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2007, 10 (04) : 167 - 173
  • [7] A noise-robust FFT-Based auditory spectrum with application in audio classification
    Chu, Wei
    Champagne, Benoit
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (01) : 137 - 150
  • [8] Factorial Speech Processing Models for Noise-Robust Automatic Speech Recognition
    Khademian, Mahdi
    Homayounpour, Mohammad Mehdi
    [J]. 2015 23RD IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE), 2015, : 637 - 642
  • [9] A noise-robust front-end based on tree-structured filter-bank for speech recognition
    Kil, RM
    Kim, YI
    Lee, GH
    [J]. IJCNN 2000: PROCEEDINGS OF THE IEEE-INNS-ENNS INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOL VI, 2000, : 81 - 86
  • [10] INCORPORATING MASK MODELLING FOR NOISE-ROBUST AUTOMATIC SPEECH RECOGNITION
    Koekueer, Muenevver
    Jancovic, Peter
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS, 2009, : 3929 - 3932