Auditory-model based robust feature selection for speech recognition

Cited by: 11
Authors
Koniaris, Christos [1 ]
Kuropatwinski, Marcin [1 ]
Kleijn, W. Bastiaan [1 ]
Affiliations
[1] Royal Inst Technol, KTH, Sch Elect Engn, Sound & Image Proc Lab, SE-10044 Stockholm, Sweden
Keywords
feature extraction; hearing; speech recognition;
DOI
10.1121/1.3284545
CLC number
O42 [Acoustics];
Discipline codes
070206; 082403
Abstract
It is shown that robust dimension-reduction of a feature set for speech recognition can be based on a model of the human auditory system. Whereas conventional methods optimize classification performance, the proposed method exploits knowledge implicit in the auditory periphery, inheriting its robustness. Features are selected to maximize the similarity of the Euclidean geometry of the feature domain and the perceptual domain. Recognition experiments using mel-frequency cepstral coefficients (MFCCs) confirm the effectiveness of the approach, which does not require labeled training data. For noisy data the method outperforms commonly used discriminant-analysis based dimension-reduction methods that rely on labeling. The results indicate that selecting MFCCs in their natural order results in subsets with good performance.
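The abstract's central idea (choosing the feature subset whose Euclidean geometry best matches that of a perceptual domain) can be illustrated with a minimal sketch. This is not the paper's exact criterion or auditory model: the perceptual representation here is a hypothetical stand-in (a fixed linear map), and the selection is a simple greedy search over MFCC dimensions, purely for demonstration.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    sq = np.sum(X * X, axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)

def geometry_mismatch(X_sub, D_perc):
    """Frobenius mismatch between the feature-subset and perceptual
    distance matrices, after an optimal least-squares scaling."""
    D_feat = pairwise_sq_dists(X_sub)
    denom = np.sum(D_feat * D_feat)
    alpha = np.sum(D_feat * D_perc) / denom if denom > 0 else 0.0
    return np.linalg.norm(alpha * D_feat - D_perc)

def greedy_select(X, D_perc, n_keep):
    """Greedy forward selection of feature dimensions that minimize
    the mismatch between feature-domain and perceptual-domain geometry."""
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(n_keep):
        best = min(remaining,
                   key=lambda j: geometry_mismatch(X[:, selected + [j]], D_perc))
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 13))    # stand-in for 13 MFCCs over 40 frames
A = rng.standard_normal((13, 8))     # hypothetical "auditory model" map
D_perc = pairwise_sq_dists(X @ A)    # perceptual-domain distances
subset = greedy_select(X, D_perc, n_keep=6)
print(sorted(subset))
```

Note that, unlike discriminant analysis, nothing here uses class labels: the selection criterion depends only on distance geometry, which is the property the abstract credits for the method's robustness.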
Pages: EL73-EL79
Page count: 7
Related papers
(50 in total)
  • [31] DTW-based feature selection for speech recognition and speaker recognition
    Liu, Jing-Wei
    Xu, Mei-Zhi
    Zheng, Zhong-Guo
    Cheng, Qian-Sheng
    Moshi Shibie yu Rengong Zhineng/Pattern Recognition and Artificial Intelligence, 2005, 18(01): 50-54
  • [32] Two-stage model-based feature compensation for robust speech recognition
    Shen, Haifeng
    Liu, Gang
    Guo, Jun
    COMPUTING, 2012, 94(01): 1-20
  • [33] Feature extraction for robust speech recognition
    Dharanipragada, S
    2002 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, PROCEEDINGS, 2002: 855-858
  • [34] Auditory-modeling inspired methods of feature extraction for robust automatic speech recognition
    Jing, ZN
    Hasegawa-Johnson, M
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002: 4176-4176
  • [35] A model of dynamic auditory perception and its application to robust speech recognition
    Strope, B
    Alwan, A
    1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, CONFERENCE PROCEEDINGS, VOLS 1-6, 1996: 37-40
  • [36] Auditory model for robust speech recognition in real world noisy environments
    Kim, DS
    Lee, SY
    Kil, RM
    Zhu, XL
    ELECTRONICS LETTERS, 1997, 33(01): 12-13
  • [37] A robust speech recognition based on the feature of weighting combination ZCPA
    Zhang, Xueying
    Liang, Wuzhou
    ICICIC 2006: FIRST INTERNATIONAL CONFERENCE ON INNOVATIVE COMPUTING, INFORMATION AND CONTROL, VOL 3, PROCEEDINGS, 2006: 361+
  • [38] Double Gaussian based feature normalization for robust speech recognition
    Liu, B
    Dai, LR
    Li, JY
    Wang, RH
    2004 INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2004: 253-256
  • [39] A Multichannel Feature-Based Processing for Robust Speech Recognition
    Souden, Mehrez
    Kinoshita, Keisuke
    Delcroix, Marc
    Nakatani, Tomohiro
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011: 696-699
  • [40] Speech feature extraction based on wavelet modulation scale for robust speech recognition
    Ma, Xin
    Zhou, Weidong
    Ju, Fang
    Jiang, Qi
    NEURAL INFORMATION PROCESSING, PT 2, PROCEEDINGS, 2006, 4233: 499-505