Optimizing feature extraction for speech recognition

Cited by: 20

Authors:

Lee, CH [1]

Hyun, DH [1]

Choi, ES [1]

Go, JW [1]

Lee, CY [1]

Affiliation:

[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul 120749, South Korea

Source:

IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING | 2003, Vol. 11, No. 1

Keywords:

critical band filters; feature extraction; melcepstrum; optimization; speech recognition

DOI:

10.1109/TSA.2002.805644

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

In this paper, we propose a method to minimize the loss of information during the feature extraction stage in speech recognition by optimizing the parameters of the mel-cepstrum transformation, a transform which is widely used in speech recognition. Typically, the mel-cepstrum is obtained by critical band filters whose characteristics play an important role in converting a speech signal into a sequence of vectors. First, we analyze the performance of the mel-cepstrum by changing the parameters of the filters such as shape, center frequency, and bandwidth. Then we propose an algorithm to optimize the parameters of the filters using the simplex method. Experiments with Korean digit words show that the recognition rate improved by about 4-7%.
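
As a rough illustration of the quantities the abstract refers to, the sketch below computes a mel-cepstrum from a power spectrum using filters whose center frequency and bandwidth are explicit parameters. It is not the authors' implementation: the triangular filter shape, the 2595·log10(1 + f/700) mel formula, and every function name here are assumptions chosen for the example, and the simplex-based parameter search itself is not shown.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale approximation (an assumption; the paper may use another).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_filter(freqs_hz, center_hz, bandwidth_hz):
    # Symmetric triangle: weight 1 at the center, 0 at center +/- bandwidth/2.
    half = bandwidth_hz / 2.0
    return [max(0.0, 1.0 - abs(f - center_hz) / half) for f in freqs_hz]

def mel_centers(n_filters, f_low_hz, f_high_hz):
    # Center frequencies equally spaced on the mel scale (the usual default).
    lo, hi = hz_to_mel(f_low_hz), hz_to_mel(f_high_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

def mel_cepstrum(power_spectrum, freqs_hz, centers_hz, bandwidths_hz, n_ceps):
    # 1) Filterbank log energies, one per (center, bandwidth) pair.
    log_energies = []
    for center, bandwidth in zip(centers_hz, bandwidths_hz):
        weights = triangular_filter(freqs_hz, center, bandwidth)
        energy = sum(w * p for w, p in zip(weights, power_spectrum))
        log_energies.append(math.log(energy + 1e-10))  # floor avoids log(0)
    # 2) DCT-II of the log energies gives the cepstral coefficients.
    n = len(log_energies)
    return [
        sum(e * math.cos(math.pi * k * (j + 0.5) / n)
            for j, e in enumerate(log_energies))
        for k in range(n_ceps)
    ]

# Usage: a flat toy spectrum on 129 bins covering 0-4000 Hz.
freqs = [i * 4000.0 / 128 for i in range(129)]
spectrum = [1.0] * len(freqs)
centers = mel_centers(20, 0.0, 4000.0)
bandwidths = [300.0] * len(centers)  # hypothetical fixed bandwidth in Hz
ceps = mel_cepstrum(spectrum, freqs, centers, bandwidths, n_ceps=12)
```

In a setup like this, the lists `centers` and `bandwidths` (and the filter shape) are the free parameters that an optimizer would adjust against a recognition-based objective.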

Pages: 80-87

Page count: 8
