Optimizing feature extraction for speech recognition

Cited by: 20
Authors
Lee, CH [1]
Hyun, DH [1]
Choi, ES [1]
Go, JW [1]
Lee, CY [1]
Affiliation
[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul 120749, South Korea
Source
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2003, Vol. 11, No. 1
Keywords
critical band filters; feature extraction; melcepstrum; optimization; speech recognition;
DOI
10.1109/TSA.2002.805644
Chinese Library Classification (CLC) number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
In this paper, we propose a method to minimize the loss of information during the feature extraction stage in speech recognition by optimizing the parameters of the mel-cepstrum transformation, a transform which is widely used in speech recognition. Typically, the mel-cepstrum is obtained by critical band filters whose characteristics play an important role in converting a speech signal into a sequence of vectors. First, we analyze the performance of the mel-cepstrum by changing the parameters of the filters such as shape, center frequency, and bandwidth. Then we propose an algorithm to optimize the parameters of the filters using the simplex method. Experiments with Korean digit words show that the recognition rate improved by about 4-7%.
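The abstract names three tunable filter properties: shape, center frequency, and bandwidth. The following is a minimal sketch of how such a parameterized filterbank could feed a mel-cepstrum computation. It is not the authors' implementation. The symmetric triangular shape, the common `2595 * log10(1 + f/700)` mel formula, and all function names (`default_filter_params`, `triangular_weight`, `mel_cepstrum`) are assumptions made for illustration. The optimization step itself is omitted; a search routine would treat the `(center, bandwidth)` pairs as its variables and a recognition-error measure as its objective.

```python
import math


def hz_to_mel(f_hz):
    # Common mel-scale approximation (assumed; the paper may use another).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def default_filter_params(n_filters, f_low_hz, f_high_hz):
    """Starting point: centers equally spaced on the mel scale.

    Returns a list of (center_hz, bandwidth_hz) pairs, where each
    bandwidth spans the distance between the two neighbouring centers.
    """
    lo, hi = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (hi - lo) / (n_filters + 1)
    edges = [mel_to_hz(lo + i * step) for i in range(n_filters + 2)]
    return [(edges[i], edges[i + 1] - edges[i - 1])
            for i in range(1, n_filters + 1)]


def triangular_weight(f_hz, center_hz, bandwidth_hz):
    # Symmetric triangle: 1 at the center, 0 at center +/- bandwidth/2.
    half_width = bandwidth_hz / 2.0
    return max(0.0, 1.0 - abs(f_hz - center_hz) / half_width)


def mel_cepstrum(power_spectrum, sample_rate, params, n_ceps):
    """Log filterbank energies followed by a DCT-II.

    power_spectrum holds bins from 0 Hz up to sample_rate / 2 inclusive.
    """
    bin_hz = (sample_rate / 2.0) / (len(power_spectrum) - 1)
    log_energies = []
    for center_hz, bandwidth_hz in params:
        energy = sum(p * triangular_weight(k * bin_hz, center_hz, bandwidth_hz)
                     for k, p in enumerate(power_spectrum))
        log_energies.append(math.log(energy + 1e-10))  # floor avoids log(0)
    n = len(log_energies)
    return [sum(log_energies[j] * math.cos(math.pi * i * (j + 0.5) / n)
                for j in range(n))
            for i in range(n_ceps)]


if __name__ == "__main__":
    # Toy decaying spectrum; real input would come from a windowed FFT frame.
    spectrum = [1.0 / (1.0 + k) for k in range(129)]
    params = default_filter_params(10, 100.0, 3800.0)
    print(mel_cepstrum(spectrum, 8000, params, 6))
```

Keeping the filter parameters in a plain list separate from the cepstrum computation means a candidate set can be changed and re-evaluated without touching the rest of the pipeline, which is the property any parameter search over the filters would rely on.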
Pages: 80-87
Page count: 8
Related papers
50 in total
  • [21] Tandem connectionist feature extraction for conversational speech recognition
    Zhu, QF
    Chen, B
    Morgan, N
    Stolcke, A
    MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2005, 3361 : 223 - 231
  • [22] Applying feature extraction of speech recognition on VOIP auditing
    Wang, Xuan
    Lin, Jiancheng
    Sun, Yong
    2007 THIRD INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING, VOL 1, PROCEEDINGS, 2007, : 237 - +
  • [23] Modified feature extraction methods in robust speech recognition
    Rajnoha, Josef
    Pollak, Petr
    2007 17TH INTERNATIONAL CONFERENCE RADIOELEKTRONIKA, VOLS 1 AND 2, 2007, : 337 - +
  • [24] Feature Extraction Analysis on Indonesian Speech Recognition System
    Wisesty, Untari N.
    Adiwijaya
    Astuti, Widi
    2015 3rd International Conference on Information and Communication Technology (ICoICT), 2015, : 54 - 58
  • [25] Applying sparse KPCA for feature extraction in speech recognition
    Lima, A
    Zen, H
    Nankaku, Y
    Tokuda, K
    Kitamura, T
    Resende, FG
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (03): : 401 - 409
  • [26] Discriminative temporal feature extraction for robust speech recognition
    Shen, JL
    ELECTRONICS LETTERS, 1997, 33 (19) : 1598 - 1600
  • [27] Soft Margin Feature Extraction for Automatic Speech Recognition
    Li, Jinyu
    Lee, Chin-Hui
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 293 - 296
  • [28] A Salient Feature Extraction Algorithm for Speech Emotion Recognition
    Liang, Ruiyu
    Tao, Huawei
    Tang, Guichen
    Wang, Qingyun
    Zhao, Li
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (09): : 1715 - 1718
  • [29] Applying feature extraction of speech recognition on VOIP auditing
    Wang, Xuan
    Lin, Jiancheng
    Sun, Yong
    Gan, Haibo
    Yao, Lin
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2009, 5 (07): : 1851 - 1856
  • [30] On the use of kernel PCA for feature extraction in speech recognition
    Lima, A
    Zen, H
    Nankaku, Y
    Miyajima, C
    Tokuda, K
    Kitamura, T
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2004, E87D (12) : 2802 - 2811