Multi-Resolution Feature Extraction Algorithm in Emotional Speech Recognition

Cited by: 2

Authors:

Zelenik, Ales [1]

Kacic, Zdravko [2]

Affiliations:

[1] NXP Semicond Gratkorn GmbH, A-8101 Gratkorn, Austria

[2] Fac Elect Engn & Comp Sci, Maribor 2000, Slovenia

Source:

ELEKTRONIKA IR ELEKTROTECHNIKA | 2015 / Vol. 21 / No. 05

Keywords:

Speech; emotion recognition; segmentation; multi-resolution

DOI:

10.5755/j01.eee.21.5.13328

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification code:

0808; 0809

Abstract:

This paper presents a new approach for recognizing emotional speech from audio recordings. Obtaining the optimum processing window width for feature extraction, and with it the highest recognition rates, requires a trade-off between time and frequency resolution. We define a new procedure that combines the advantages of narrower and wider windows by dynamically adjusting the time and frequency resolution of individual features. To achieve higher recognition rates, two major procedures are added to the multi-resolution feature-extraction concept: the exclusion of features calculated on different processing window widths, and the use of only those parts of the recordings with the most explicit emotions. To confirm the benefits of the algorithm, audio recordings from the emotional speech database Interface were used in the evaluation, along with four different classifiers. The highest emotion recognition rate of the multi-resolution approach exceeded that of the best single-resolution approach by 3.5 %, with an average improvement of 1.5 % in absolute terms.
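The general idea of extracting features at several window widths can be illustrated with a minimal sketch. This is not the authors' algorithm: the window widths (10, 25 and 50 ms), the two features (log energy and zero-crossing rate), the mean/standard-deviation pooling, and all function names are assumptions chosen only for illustration, and the feature exclusion and segment selection steps mentioned in the abstract are not modeled.

```python
import numpy as np

def frame_signal(x, win, hop):
    """Split a 1-D signal into overlapping frames of length `win`."""
    n_frames = 1 + (len(x) - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def frame_features(frames):
    """Per-frame log energy and zero-crossing rate (illustrative features only)."""
    energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return np.stack([energy, zcr], axis=1)

def multi_resolution_features(x, sr, widths_ms=(10, 25, 50)):
    """Concatenate utterance-level statistics computed at several window widths.

    Narrow windows favor time resolution, wide windows favor frequency
    resolution; the widths used here are assumed values, not from the paper.
    """
    parts = []
    for width in widths_ms:
        win = int(sr * width / 1000)
        hop = win // 2  # 50 % overlap, an assumed setting
        feats = frame_features(frame_signal(x, win, hop))
        parts.append(feats.mean(axis=0))
        parts.append(feats.std(axis=0))
    return np.concatenate(parts)

# Synthetic one-second tone standing in for a speech recording.
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 220 * t)
vec = multi_resolution_features(x, sr)
```

Each window width contributes four values here (mean and standard deviation of two features), so three widths give a 12-dimensional vector that could be passed to any classifier.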


Pages: 54-58

Page count: 5
