Discriminative auditory-based features for robust speech recognition

Cited by: 21
Authors
Mak, BKW [1]
Tam, YC
Li, PQ
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Li Creat Technol Inc, New Providence, NJ 07974 USA
Source
Keywords
auditory-based filter; discriminative feature extraction; generalized probabilistic descent; minimum classification error
DOI
10.1109/TSA.2003.819951
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Recently, a new auditory-based feature extraction algorithm for robust speech recognition in noisy environments was proposed. The new features are derived by closely mimicking the human peripheral auditory process; the filters for the outer ear, middle ear, and inner ear are obtained from the psychoacoustics literature with some manual adjustments. In this paper, we extend the auditory-based feature extraction algorithm and propose to further train the auditory-based filters through discriminative training. Using this data-driven approach, we optimize the filters by minimizing the subsequent recognition errors on a task. One significant contribution over similar past efforts (generally under the name of "discriminative feature extraction") is that we make no assumption about the parametric form of the auditory-based filters. Instead, we only require the filters to be triangular-like: the filter weights have a maximum value in the middle and then decrease monotonically toward both ends. Discriminative training of these constrained auditory-based filters leads to improved performance. Furthermore, we study the combined discriminative training procedure for both feature and acoustic model parameters. Our experiments show that the best performance is obtained with a sequential procedure under the unified framework of minimum classification error (MCE) training with generalized probabilistic descent (GPD).
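The triangular-like constraint described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sigmoid-ratio parameterization, and the example values are all assumptions made for illustration. The sketch shows (a) a check that a weight vector rises to a single peak and then falls, and (b) one hypothetical way to map unconstrained parameters to weights that satisfy the constraint by construction, so that a gradient-based trainer could update the free parameters without leaving the allowed set.

```python
import math


def is_triangular_like(weights):
    """Return True if weights rise (non-strictly) to one peak, then fall."""
    if not weights:
        return False
    # Index of the (first) maximum weight.
    peak = max(range(len(weights)), key=weights.__getitem__)
    rising = all(weights[i] <= weights[i + 1] for i in range(peak))
    falling = all(
        weights[i] >= weights[i + 1] for i in range(peak, len(weights) - 1)
    )
    return rising and falling


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def triangular_like_weights(left_params, right_params, peak_value=1.0):
    """Hypothetical parameterization (an assumption, not from the paper).

    Each free parameter is squashed into a ratio in (0, 1); weights are
    cumulative products of these ratios moving away from the peak, so they
    can only shrink toward both ends. Requires peak_value > 0.
    """
    def decay(params):
        out, w = [], peak_value
        for p in params:  # first parameter is the one nearest the peak
            w *= _sigmoid(p)
            out.append(w)
        return out

    left = decay(left_params)
    right = decay(right_params)
    return left[::-1] + [peak_value] + right


# Example: any real-valued parameters yield a valid filter shape.
weights = triangular_like_weights([0.5, -1.0], [2.0, 0.0, -3.0])
print(is_triangular_like(weights))
print(is_triangular_like([0.1, 0.9, 0.3, 0.8, 0.2]))
```

The design choice in this sketch is to enforce the constraint through the parameterization rather than by projecting after each update; either approach is compatible with the description above, and the abstract does not say which one the authors use.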
Pages: 27-36
Page count: 10