Optimizing feature extraction for speech recognition

被引:20
|
作者
Lee, CH [1 ]
Hyun, DH [1 ]
Choi, ES [1 ]
Go, JW [1 ]
Lee, CY [1 ]
机构
[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul 120749, South Korea
来源
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2003年 / 11卷 / 01期
关键词
critical band filters; feature extraction; melcepstrum; optimization; speech recognition;
D O I
10.1109/TSA.2002.805644
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
In this paper, we propose a method to minimize the loss of information during the feature extraction stage in speech recognition by optimizing the parameters of the mel-cepstrum transformation, a transform which is widely used in speech recognition. Typically, the mel-cepstrum is obtained by critical band filters whose characteristics play an important role in converting a speech signal into a sequence of vectors. First, we analyze the performance of the mel-cepstrum by changing the parameters of the filters such as shape, center frequency, and bandwidth. Then we propose an algorithm to optimize the parameters of the filters using the simplex method. Experiments with Korean digit words show that the recognition rate improved by about 4-7%.
引用
收藏
页码:80 / 87
页数:8
相关论文
共 50 条
  • [1] Speech recognition as feature extraction for speaker recognition
    Stolcke, A.
    Shriberg, E.
    Ferrer, L.
    Kajarekar, S.
    Sonmez, K.
    Tur, G.
    2007 IEEE WORKSHOP ON SIGNAL PROCESSING APPLICATIONS FOR PUBLIC SECURITY AND FORENSICS, 2007, : 39 - +
  • [2] Feature extraction for robust speech recognition
    Dharanipragada, S
    2002 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II, PROCEEDINGS, 2002, : 855 - 858
  • [3] Visual speech feature extraction for improved speech recognition
    Zhang, X
    Mersereau, RM
    Clements, M
    Broun, CC
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 1993 - 1996
  • [4] Optimizing feature extraction for English word recognition
    Choi, E
    Hyun, D
    Lee, C
    2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 813 - 816
  • [5] Composite Feature Extraction for Speech Emotion Recognition
    Fu, Yangzhi
    Yuan, Xiaochen
    2020 IEEE 23RD INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2020), 2020, : 72 - 77
  • [6] Reduced Feature Extraction for Emotional Speech Recognition
    Palo, Hemanta Kumar
    Mohanty, Mihir Narayan
    2015 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2015,
  • [7] Sparse KPCA for feature extraction in speech recognition
    Lima, A
    Zen, H
    Nankaku, Y
    Tokuda, K
    Kitamura, T
    Resende, FG
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 353 - 356
  • [8] The application of optimization in feature extraction of speech recognition
    Gu, L
    Liu, RS
    ICSP '96 - 1996 3RD INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, PROCEEDINGS, VOLS I AND II, 1996, : 745 - 748
  • [9] Geometrical feature extraction for robust speech recognition
    Li, Xiaokun
    Kwan, Chiman
    2005 39TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, VOLS 1 AND 2, 2005, : 558 - 562
  • [10] Feature extraction for automatic speech recognition (ASR)
    Swartz, B
    Magotra, N
    THIRTIETH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, VOLS 1 AND 2, 1997, : 748 - 751