Multi-Resolution Feature Extraction Algorithm in Emotional Speech Recognition

Cited by: 2
Authors
Zelenik, Ales [1]
Kacic, Zdravko [2]
Affiliations
[1] NXP Semicond Gratkorn GmbH, A-8101 Gratkorn, Austria
[2] Fac Elect Engn & Comp Sci, Maribor 2000, Slovenia
Keywords
Speech; emotion recognition; segmentation; multi-resolution
DOI
10.5755/j01.eee.21.5.13328
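Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification codes
0808; 0809
Abstract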
In this paper a new approach for recognizing emotional speech from audio recordings is presented. In order to obtain the optimum processing window width for feature extraction and to achieve the highest level of recognition rates, a trade-off between time and frequency resolution must be made. At this point, we define a new procedure that combines the advantages of narrower and wider windows and takes advantage of dynamic adjustment of the time and frequency resolution of individual feature characteristics. To achieve higher recognition rates two major procedures are added to the multi-resolution feature-extraction concept, one being the exclusion of features calculated on different processing window widths and the other the idea to use only the parts of recordings with most explicit emotions. To confirm the benefits of the algorithm the audio recordings from the emotional speech database Interface along with four different classifiers were used in evaluation. The highest level of emotion recognition rate with multi-resolution approach exceeded the recognition rate of the best single-resolution approach by 3.5 % with the average improvement of 1.5 % in absolute terms.
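The window-width trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the two features (log energy and a normalized spectral centroid), the window widths, the 50 % hop, and all function names are assumptions chosen for illustration, and the feature-exclusion and segment-selection steps mentioned in the abstract are not implemented.

```python
import numpy as np

def frame_signal(x, width, hop):
    """Split a 1-D signal into overlapping frames of `width` samples."""
    n_frames = 1 + (len(x) - width) // hop
    idx = np.arange(width)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def frame_features(x, width, hop):
    """Per-frame log energy and spectral centroid for one window width.

    The centroid is expressed as a fraction of the Nyquist frequency so that
    values from different window widths are comparable.
    """
    frames = frame_signal(x, width, hop) * np.hanning(width)
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    log_energy = np.log(np.sum(frames ** 2, axis=1) + 1e-12)
    freqs = np.linspace(0.0, 1.0, spectrum.shape[1])
    centroid = (spectrum * freqs).sum(axis=1) / (spectrum.sum(axis=1) + 1e-12)
    return np.stack([log_energy, centroid], axis=1)

def multi_resolution_features(x, widths=(256, 512, 1024)):
    """Concatenate mean and std of the frame features over several widths.

    Narrow windows give finer time resolution, wide windows finer frequency
    resolution; the widths here are hypothetical, not taken from the paper.
    """
    parts = []
    for w in widths:
        f = frame_features(x, w, hop=w // 2)
        parts.extend([f.mean(axis=0), f.std(axis=0)])
    return np.concatenate(parts)
```

Calling `multi_resolution_features(x)` on a one-second recording sampled at 8 kHz yields one vector with 12 entries (3 widths × 2 statistics × 2 features), which could then be passed to any classifier.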
Pages: 54-58
Page count: 5
Related papers
50 items in total
  • [31] Two-dimensional multi-resolution analysis of speech signals and its application to speech recognition
    Chan, CP
    Wong, YW
    Lee, T
    Ching, PC
    ICASSP '99: 1999 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS VOLS I-VI, 1999, : 405 - 408
  • [32] Multi-Resolution Soft Features for Channel-Robust Distributed Speech Recognition
    Ion, Valentin
    Haeb-Umbach, Reinhold
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 805 - 808
  • [33] A Multi-resolution Action Recognition Algorithm using Wavelet Domain Features
    Imtiaz, Hafiz
    Mahbub, Upal
    Schaefer, Gerald
    Ahad, Md. Atiqur Rahman
    2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013), 2013, : 537 - 541
  • [34] Multi-resolution wavelet analysis for chopped impulse voltage measurements and feature extraction
    Onal, Emel
    Kalenderli, Ozcan
    Seker, Serhat
    IEEE TRANSACTIONS ON DIELECTRICS AND ELECTRICAL INSULATION, 2008, 15 (03) : 893 - 900
  • [35] Convolution Neural Network with Multi-Resolution Feature Fusion for Facial Expression Recognition
    He Zhichao
    Zhao Longzhang
    Chen Chuang
    LASER & OPTOELECTRONICS PROGRESS, 2018, 55 (07)
  • [36] LPI Radar Waveform Recognition Based on Multi-Resolution Deep Feature Fusion
    Ni, Xue
    Wang, Huali
    Meng, Fan
    Hu, Jing
    Tong, Changkai
    IEEE ACCESS, 2021, 9 : 26138 - 26146
  • [37] Multi-resolution elongated CS-LDP with Gabor feature for face recognition
    Chen, Xi
    Hu, Fangyuan
    Liu, Zengli
    Huang, Qingsong
    Zhang, Jiashu
    INTERNATIONAL JOURNAL OF BIOMETRICS, 2016, 8 (01) : 19 - 32
  • [38] Speech recognition as feature extraction for speaker recognition
    Stolcke, A.
    Shriberg, E.
    Ferrer, L.
    Kajarekar, S.
    Sonmez, K.
    Tur, G.
    2007 IEEE WORKSHOP ON SIGNAL PROCESSING APPLICATIONS FOR PUBLIC SECURITY AND FORENSICS, 2007, : 39 - +
  • [39] Multi-resolution speech analysis for automatic speech recognition using deep neural networks: Experiments on TIMIT
    Toledano, Doroteo T.
    Pilar Fernandez-Gallego, Maria
    Lozano-Diez, Alicia
    PLOS ONE, 2018, 13 (10)
  • [40] Optimizing feature extraction for speech recognition
    Lee, CH
    Hyun, DH
    Choi, ES
    Go, JW
    Lee, CY
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2003, 11 (01) : 80 - 87