A model distance maximizing framework for speech recognizer-based speech enhancement

Cited by: 3
Authors
BabaAli, Bagher [1]
Sameti, Hossein [1]
Falk, Tiago H. [2]
Affiliations
[1] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
[2] Univ Toronto, Bloorview Res Inst, Toronto, ON, Canada
Keywords
Robust speech recognition; Speech recognizer-based speech enhancement; Model distance maximizing; Spectral subtraction; Noise; Compensation
DOI
10.1016/j.aeue.2010.02.002
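Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809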
Abstract
This paper presents a novel discriminative parameter calibration approach based on the model distance maximizing (MDM) framework to improve the performance of our previously proposed method based on spectral subtraction (SS) in a likelihood-maximizing framework. In the previous work, spectral over-subtraction factors were adjusted using the conventional maximum-likelihood (ML) approach, which utilized only the true model and did not consider other confusable models, and therefore likely reached suboptimal solutions. In the proposed MDM framework, by contrast, improved speech recognition performance is obtained by maximizing the dissimilarities among models. Experimental results on the FARSDAT and TIMIT databases and on real distant-talking databases demonstrate that the MDM framework outperforms ML in terms of recognition accuracy. (C) 2010 Elsevier GmbH. All rights reserved.
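The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flooring rule with `beta`, and the averaged-competitor margin are assumptions chosen for illustration, and the paper's actual MDM objective may be defined differently. The first function applies spectral over-subtraction to a power spectrogram; the last two contrast a criterion that looks only at the true model with one that also penalizes high scores from competing models.

```python
import numpy as np


def spectral_oversubtract(noisy_power, noise_power, alpha, beta=0.01):
    """Spectral over-subtraction on a power spectrogram.

    noisy_power: array of shape (frames, bins), noisy power spectrum.
    noise_power: array of shape (bins,), estimated noise power spectrum.
    alpha: over-subtraction factor (scalar or per-bin array); this is the
           kind of parameter the abstract describes calibrating.
    beta: spectral floor factor (an assumption of this sketch).
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    subtracted = noisy_power - alpha * noise_power
    # Floor the result so no bin goes negative after subtraction.
    floor = beta * noise_power
    return np.maximum(subtracted, floor)


def ml_score(loglik_true):
    """ML-style criterion: only the true model's log-likelihood counts."""
    return float(loglik_true)


def margin_score(loglik_true, loglik_competitors):
    """Hypothetical discriminative criterion: true-model log-likelihood
    minus the mean log-likelihood of competing models. Illustrative
    stand-in only, not the MDM formula from the paper."""
    return float(loglik_true) - float(np.mean(loglik_competitors))
```

Under these assumptions, a calibration loop would enhance the features with a candidate `alpha`, score them with the recognizer's models, and keep the `alpha` that maximizes the chosen criterion. With `ml_score`, a setting that raises every model's likelihood equally looks like an improvement; with `margin_score` it does not, which is the intuition behind preferring a discriminative criterion.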
Pages: 99-106
Number of pages: 8