A robust speech detection algorithm for speech activated hands-free applications

Cited by: 2
Authors
Wu, D [1]
Tanaka, M [1]
Chen, R [1]
Olorenshaw, L [1]
Amador, M [1]
Menendez-Pidal, X [1]
Affiliations
[1] Sony US Res Labs, Spoken Language Technol, San Jose, CA 95134 USA
DOI
10.1109/ICASSP.1999.758424
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
This paper describes a novel noise-robust speech detection algorithm that can operate reliably in severe car noise conditions. High performance has been obtained with the following techniques: (1) noise suppression based on principal component analysis for pre-processing, (2) robust endpoint detection using dynamic parameters [1], and (3) speech verification using the periodicity of voiced signals with harmonic enhancement. Noise suppression improves the SNR by about 20 dB compared with nonlinear spectral subtraction, which lets endpoint detection operate reliably at SNRs down to -10 dB. In car environments, road bump noises are problematic for speech detectors because they cause mis-detection errors. Speech verification helps to remove these errors. This technology is being used in Sony car navigation products.
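The abstract gives no implementation details. As a rough illustration of the idea behind technique (3) only, the sketch below scores a frame by its peak normalized autocorrelation over plausible pitch lags: a voiced, periodic frame scores close to 1, while an aperiodic noise burst scores much lower. This is not the authors' method. The function name `periodicity_score`, the 8 kHz sample rate, the 60 to 400 Hz pitch range, and both synthetic test signals are assumptions made for the example, and the harmonic enhancement step is omitted.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed for this sketch, not stated in the abstract


def periodicity_score(frame, min_lag, max_lag):
    """Peak normalized autocorrelation of `frame` over lags min_lag..max_lag."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove DC offset before correlating
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        a = x[:n - lag]
        b = x[lag:]
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
        if den > 0.0:
            best = max(best, num / den)
    return best


def synthetic_voiced(n, f0):
    """Hypothetical voiced-like frame: four harmonics of pitch f0."""
    return [
        sum(math.sin(2 * math.pi * k * f0 * i / SAMPLE_RATE) / k
            for k in range(1, 5))
        for i in range(n)
    ]


def synthetic_burst(n, seed=0):
    """Hypothetical aperiodic burst: decaying white Gaussian noise."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) * math.exp(-i / 200.0) for i in range(n)]


# Assumed pitch search range of 60 to 400 Hz, expressed as lags in samples.
MIN_LAG = SAMPLE_RATE // 400
MAX_LAG = SAMPLE_RATE // 60

voiced_score = periodicity_score(synthetic_voiced(400, 125.0), MIN_LAG, MAX_LAG)
burst_score = periodicity_score(synthetic_burst(400), MIN_LAG, MAX_LAG)
print(f"voiced frame: {voiced_score:.2f}, noise burst: {burst_score:.2f}")
```

A detector built along these lines would accept a candidate segment only if enough of its frames exceed some periodicity threshold. Any such threshold is a tuning choice and is not taken from the paper.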
Pages: 2407-2410
Page count: 4
Related papers
50 in total
  • [1] A noise robust speech activity detection algorithm for voice activated hands-free
    Bagur, H
    [J]. Seventh IASTED International Conference on Signal and Image Processing, 2005: 1-5
  • [2] Adaptive pitch-based speech detection for hands-free applications
    Abu-El-Quran, AR
    Goubran, RA
    [J]. 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005: 305-308
  • [3] Speech enhancement for hands-free terminals
    Grbic, N
    Nordholm, S
    Johansson, A
    [J]. ISPA 2001: PROCEEDINGS OF THE 2ND INTERNATIONAL SYMPOSIUM ON IMAGE AND SIGNAL PROCESSING AND ANALYSIS, 2001: 435-440
  • [4] Microphone Array Beampattern Characterization for Hands-free Speech Applications
    Taghizadeh, Mohammad J.
    Garner, Philip N.
    Bourlard, Herve
    [J]. 2012 IEEE 7TH SENSOR ARRAY AND MULTICHANNEL SIGNAL PROCESSING WORKSHOP (SAM), 2012: 465-468
  • [5] Speech recognizer-based microphone array processing for robust hands-free speech recognition
    Seltzer, ML
    Raj, B
    Stern, RM
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002: 897-900
  • [6] Likelihood-maximizing beamforming for robust hands-free speech recognition
    Seltzer, ML
    Raj, B
    Stern, RM
    [J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2004, 12(5): 489-498
  • [7] Fast dereverberation for hands-free speech recognition
    Gomez, Randy
    Even, Jani
    Saruwatari, Hiroshi
    Shikano, Kiyohiro
    [J]. 2008 HANDS-FREE SPEECH COMMUNICATION AND MICROPHONE ARRAYS, 2008: 141-+
  • [8] Sector-based detection for hands-free speech enhancement in cars
    Lathoud, Guillaume
    Bourgeois, Julien
    Freudenberger, Juergen
    [J]. EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2006, 2006(1)
  • [9] Sector-Based Detection for Hands-Free Speech Enhancement in Cars
    Guillaume Lathoud
    Julien Bourgeois
    Jürgen Freudenberger
    [J]. EURASIP Journal on Advances in Signal Processing, 2006
  • [10] Real-time audio signal enhancement for hands-free speech applications
    Feher, Thomas
    Freitag, Michael
    Gruber, Christian
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015: 1246-1250