Discriminative auditory-based features for robust speech recognition

Cited: 21
Authors
Mak, BKW [1]
Tam, YC [2]
Li, PQ [3]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci, Hong Kong, Hong Kong, Peoples R China
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Li Creat Technol Inc, New Providence, NJ 07974 USA
Source
IEEE Transactions on Speech and Audio Processing
Keywords
auditory-based filter; discriminative feature extraction; generalized probabilistic descent; minimum classification error
DOI
10.1109/TSA.2003.819951
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject Classification Codes
070206; 082403
Abstract
Recently, a new auditory-based feature extraction algorithm for robust speech recognition in noisy environments was proposed. The new features are derived by closely mimicking the human peripheral auditory process, and the filters modeling the outer ear, middle ear, and inner ear are obtained from the psychoacoustics literature with some manual adjustments. In this paper, we extend the auditory-based feature extraction algorithm and propose to further train the auditory-based filters through discriminative training. Using a data-driven approach, we optimize the filters by minimizing the subsequent recognition errors on a task. One significant contribution over similar past efforts (generally under the name of "discriminative feature extraction") is that we make no assumption about the parametric form of the auditory-based filters. Instead, we only require the filters to be triangular-like: the filter weights reach a single maximum in the middle and decrease monotonically toward both ends. Discriminative training of these constrained auditory-based filters leads to improved performance. Furthermore, we study a combined discriminative training procedure for both the feature and the acoustic model parameters. Our experiments show that the best performance is obtained with a sequential procedure under the unified framework of minimum classification error (MCE) training with generalized probabilistic descent (GPD).
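As a concrete illustration of the shape constraint described in the abstract, the sketch below parameterizes one filter so that its weights rise monotonically to a single peak and fall monotonically afterwards, no matter what values a gradient step assigns to the underlying parameters. This is a minimal NumPy sketch of one way to enforce such a constraint; the softplus reparameterization and all function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def softplus(x):
    """Smooth map to positive values; keeps step sizes non-negative."""
    return np.log1p(np.exp(x))

def triangular_like_filter(theta_left, theta_right, peak=1.0):
    """Build one 'triangular-like' filter from unconstrained parameters.

    The weights are generated by subtracting cumulative non-negative
    decrements on each side of a central peak, so the result always has
    a single maximum in the middle and decreases monotonically toward
    both ends -- the only shape assumption placed on the filters.
    """
    left_steps = softplus(theta_left)    # decrements toward the left edge
    right_steps = softplus(theta_right)  # decrements toward the right edge
    left = (peak - np.cumsum(left_steps))[::-1]   # ascending into the peak
    right = peak - np.cumsum(right_steps)         # descending from the peak
    weights = np.concatenate([left, [peak], right])
    return np.maximum(weights, 0.0)      # keep filter weights non-negative

# Example: a 13-tap filter with equal steps of softplus(-2) on each side.
filt = triangular_like_filter(np.full(6, -2.0), np.full(6, -2.0))
assert filt.argmax() == 6 and np.all(np.diff(filt[:7]) >= 0)
```

Because the constraint holds by construction, plain gradient updates from the MCE/GPD training loop (sketched after the record below) can never push the filters out of the feasible set.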
Pages: 27-36 (10 pages)
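The unified MCE/GPD framework named at the end of the abstract descends a smoothed error count. The sketch below implements the standard smoothed MCE loss in the Juang-Katagiri style: a misclassification measure compares the correct-class discriminant with a soft maximum over the competitors, and a sigmoid turns it into a differentiable approximation of the 0-1 loss that generalized probabilistic descent can minimize with a decreasing step size. The exponents `eta` and `gamma` and the competitor pooling follow the common textbook formulation, assumed here rather than taken from this paper.

```python
import numpy as np

def mce_loss(scores, label, eta=2.0, gamma=1.0):
    """Smoothed minimum classification error (MCE) loss for one token.

    scores : discriminant value g_k of each class under current parameters
    label  : index of the correct class
    The misclassification measure d = (soft max of competitors) - g_correct
    is positive when the token is misclassified; the sigmoid maps d to a
    differentiable soft error count.
    """
    g_correct = scores[label]
    competitors = np.delete(scores, label)
    # Numerically stable soft maximum; approaches the true max as eta grows.
    m = competitors.max()
    g_comp = m + np.log(np.mean(np.exp(eta * (competitors - m)))) / eta
    d = g_comp - g_correct
    return 1.0 / (1.0 + np.exp(-gamma * d))

# Correctly classified token -> loss below 0.5 (soft error count).
print(mce_loss(np.array([0.2, 1.5, 0.9]), label=1))  # ~0.30

# GPD minimizes the expected loss by stochastic gradient steps with a
# decreasing learning rate, e.g. params -= eps0 * (1 - t / T) * grad,
# applied to the filter parameters, the acoustic model parameters, or both.
```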