Feature extraction using non-linear transformation for robust speech recognition on the AURORA database

Authors
Sharma, S [1]
Ellis, D [1]
Kajarekar, S [1]
Jain, P [1]
Hermansky, H [1]
Affiliations
[1] Intel Corp, Santa Clara, CA 95051 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
We evaluate the performance of several feature sets on the AURORA task as defined by ETSI. We show that, after a non-linear transformation, a number of features can be used effectively in an HMM-based recognition system. The non-linear transformation is computed by a neural network that is discriminatively trained on the phonetically labeled (force-aligned) training data. A combination of the non-linearly transformed PLP, MSG and TRAP features yields a 63% improvement in error rate compared with baseline MFCC features. The non-linearly transformed RASTA-like features, with system parameters scaled down to meet the ETSI-imposed memory and latency constraints, still yield a 40% improvement in error rate.
Pages: 1117-1120
Page count: 4