Feature extraction using non-linear transformation for robust speech recognition on the AURORA database

Authors
Sharma, S [1 ]
Ellis, D [1 ]
Kajarekar, S [1 ]
Jain, P [1 ]
Hermansky, H [1 ]
Affiliations
[1] Intel Corp, Santa Clara, CA 95051 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
We evaluate the performance of several feature sets on the AURORA task as defined by ETSI. We show that after a non-linear transformation, a number of features can be effectively used in an HMM-based recognition system. The non-linear transformation is computed using a neural network that is discriminatively trained on the phonetically labeled (force-aligned) training data. A combination of the non-linearly transformed PLP, MSG and TRAP features yields a 63% improvement in error rate compared to baseline MFCC features. The use of the non-linearly transformed RASTA-like features, with system parameters scaled down to take into account the ETSI-imposed memory and latency constraints, still yields a 40% improvement in error rate.
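The non-linear transformation described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy Gaussian "frames" stand in for real PLP, MSG or TRAP features, the three class labels stand in for force-aligned phone labels, and the network size, learning rate, use of log posteriors, and PCA decorrelation step are choices of this sketch. All function names are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_mlp(X, y, n_classes, n_hidden=16, lr=0.1, epochs=500, seed=0):
    """Discriminatively train a one-hidden-layer MLP on frame-level labels
    (full-batch gradient descent on the cross-entropy loss)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.3, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.3, (n_hidden, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]                    # one-hot targets
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                # hidden activations
        P = softmax(H @ W2 + b2)                # class posteriors per frame
        dZ = (P - Y) / n                        # gradient w.r.t. output logits
        dH = (dZ @ W2.T) * (1.0 - H ** 2)       # backpropagate through tanh
        W2 -= lr * (H.T @ dZ)
        b2 -= lr * dZ.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

def log_posteriors(X, params):
    """Non-linear transform: input features -> log class posteriors."""
    W1, b1, W2, b2 = params
    return np.log(softmax(np.tanh(X @ W1 + b1) @ W2 + b2) + 1e-12)

def fit_pca(F):
    """Return the mean and an orthogonal basis that decorrelates F
    (an assumed post-processing step, not stated in the abstract)."""
    mean = F.mean(axis=0)
    _, vecs = np.linalg.eigh(np.cov(F - mean, rowvar=False))
    return mean, vecs[:, ::-1]                  # largest-variance direction first

def make_toy_frames(n=600, d=5, n_classes=3, seed=1):
    """Synthetic stand-in for labeled speech frames (hypothetical data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, n_classes, n)
    means = 4.0 * np.eye(n_classes, d)          # one well-separated mean per class
    X = means[y] + rng.normal(size=(n, d))
    X = (X - X.mean(axis=0)) / X.std(axis=0)    # per-dimension normalization
    return X, y

X, y = make_toy_frames()
params = train_mlp(X, y, n_classes=3)
logp = log_posteriors(X, params)
mean, basis = fit_pca(logp)
feats = (logp - mean) @ basis                   # features for a downstream recognizer
acc = float((logp.argmax(axis=1) == y).mean())  # frame accuracy on the toy data
print(feats.shape, round(acc, 3))
```

In a setup like this, `feats` would replace the original features as the input to the HMM-based recognizer; the sketch stops at feature generation and does not attempt to reproduce the reported error-rate improvements.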
Pages: 1117-1120
Number of pages: 4
Related papers
50 in total
  • [1] Non-linear feature extraction for robust speech recognition in stationary and non-stationary noise
    Zhu, QF
    Alwan, A
    [J]. COMPUTER SPEECH AND LANGUAGE, 2003, 17 (04): : 381 - 402
  • [2] Non-linear transformations of the feature space for robust speech recognition
    de la Torre, A
    Segura, JC
    Benítez, C
    Peinado, AM
    Rubio, AJ
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 401 - 404
  • [3] Kernel based non-linear feature extraction methods for speech recognition
    Huang, Hao
    Zhu, Jie
    [J]. ISDA 2006: SIXTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, VOL 2, 2006, : 749 - +
  • [4] Non-linear speech feature extraction for phoneme classification and speaker recognition
    Chetouani, M
    Faundez-Zanuy, M
    Gas, B
    Zarader, JL
    [J]. NONLINEAR SPEECH MODELING AND APPLICATIONS, 2005, 3445 : 344 - 350
  • [5] Non-linear techniques for robust speech recognition
    Ge, Yubo
    Niu, Jing
    Ge, Lingnan
    Shirai, Katsuhiko
    [J]. CITSA 2007/CCCT 2007: INTERNATIONAL CONFERENCE ON CYBERNETICS AND INFORMATION TECHNOLOGIES, SYSTEMS AND APPLICATIONS : INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATIONS AND CONTROL TECHNOLOGIES, VOL III, POST-CONFERENCE ISSUE, PROCEEDINGS, 2007, : 134 - +
  • [6] Propagation of statistical information through non-linear feature extractions for robust speech recognition
    Astudillo, R. F.
    Kolossa, D.
    Orglmeister, R.
    [J]. BAYESIAN INFERENCE AND MAXIMUM ENTROPY METHODS IN SCIENCE AND ENGINEERING, 2007, 954 : 245 - 252
  • [7] Feature extraction for robust speech recognition
    Dharanipragada, S
    [J]. 2002 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, PROCEEDINGS, 2002, : 855 - 858
  • [8] Novel feature extraction for noise robust ASR using the Aurora 2 database
    Hix, Penny
    Zahorian, Stephen
    Meng, Fansheng
    [J]. 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006, : 541 - 544
  • [9] Geometrical feature extraction for robust speech recognition
    Li, Xiaokun
    Kwan, Chiman
    [J]. 2005 39TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, VOLS 1 AND 2, 2005, : 558 - 562
  • [10] Feature Extraction Using Linear and Non-linear Subspace Techniques
    Teixeira, Ana R.
    Tome, Ana Maria
    Lang, E. W.
    [J]. ARTIFICIAL NEURAL NETWORKS - ICANN 2009, PT II, 2009, 5769 : 115 - +