COMPARISON OF SPEAKER DEPENDENT AND SPEAKER INDEPENDENT EMOTION RECOGNITION

Cited by: 22
Authors
Rybka, Jan [1 ]
Janicki, Artur [2 ]
Affiliations
[1] Warsaw Univ Technol, Inst Comp Sci, PL-00665 Warsaw, Poland
[2] Warsaw Univ Technol, Inst Telecommun, PL-00665 Warsaw, Poland
Keywords
speech processing; emotion recognition; EMO-DB; support vector machines; artificial neural networks; speech
DOI
10.2478/amcs-2013-0060
Chinese Library Classification
TP [Automation technology; computer technology]
Subject Classification Code
0812
Abstract
This paper describes a study of emotion recognition based on speech analysis. The theoretical introduction reviews the emotion inventories used in various studies of emotion recognition, along with the speech corpora applied, methods of speech parametrization, and the most commonly employed classification algorithms. In the current study, the EMO-DB speech corpus and three selected classifiers, namely k-Nearest Neighbors (k-NN), Artificial Neural Networks (ANNs), and Support Vector Machines (SVMs), were used in experiments. SVMs provided the best classification accuracy, 75.44%, in the speaker dependent mode, that is, when speech samples from the same speaker were included in the training corpus. Various speaker dependent and speaker independent configurations were analyzed and compared. Emotion recognition in speaker dependent conditions usually yielded higher accuracy than a comparable speaker independent configuration, and the improvement was especially evident when the baseline recognition rate for a given speaker was low. Happiness and anger, as well as boredom and neutrality, proved to be the pairs of emotions most often confused.
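The contrast between the two evaluation modes described above can be illustrated with a toy experiment. The sketch below is purely illustrative and not the paper's method: it uses synthetic feature vectors and a minimal 1-NN classifier rather than EMO-DB features and SVMs. Each sample gets an emotion-specific level plus a speaker-specific bias, and accuracy is compared between a split where half of the test speaker's samples appear in training (speaker dependent) and one where that speaker is held out entirely (speaker independent).

```python
# Illustrative only: synthetic features, not EMO-DB; a 1-NN classifier,
# not the SVM/ANN setup used in the paper.
import random
from math import dist

random.seed(0)
EMOTIONS = ["anger", "happiness", "boredom", "neutral"]
SPEAKERS = ["s1", "s2", "s3"]

def make_sample(speaker, emotion):
    # Toy 4-dimensional feature vector: an emotion-specific level plus a
    # speaker-specific bias, with Gaussian noise.
    e = EMOTIONS.index(emotion)
    s = SPEAKERS.index(speaker)
    return [e + 0.6 * s + random.gauss(0.0, 0.1) for _ in range(4)]

# 10 samples per speaker and emotion; keep an index i for splitting.
data = [(spk, emo, i, make_sample(spk, emo))
        for spk in SPEAKERS for emo in EMOTIONS for i in range(10)]

def predict(train, x):
    # 1-NN: the emotion label of the closest training vector (Euclidean).
    return min(train, key=lambda t: dist(t[1], x))[0]

def accuracy(train, test):
    return sum(predict(train, x) == emo for emo, x in test) / len(test)

# The second half of speaker s1's samples is always the test set.
test_set = [(emo, x) for spk, emo, i, x in data if spk == "s1" and i >= 5]

# Speaker dependent: other speakers plus half of s1's samples in training.
train_sd = [(emo, x) for spk, emo, i, x in data if spk != "s1" or i < 5]
# Speaker independent: all of s1's samples are held out of training.
train_si = [(emo, x) for spk, emo, i, x in data if spk != "s1"]

acc_sd = accuracy(train_sd, test_set)
acc_si = accuracy(train_si, test_set)
print(f"speaker dependent accuracy:   {acc_sd:.2f}")
print(f"speaker independent accuracy: {acc_si:.2f}")
```

Because the speaker bias here is larger than the within-emotion noise, the held-out speaker's vectors often fall closer to the wrong emotion class of the training speakers, so the speaker independent split scores markedly lower, mirroring the tendency reported in the abstract.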
Pages: 797-808 (12 pages)
Related Papers
50 items in total
  • [41] Text-dependent speaker recognition using speaker specific compensation
    Laxman, S
    Sastry, PS
    IEEE TENCON 2003: CONFERENCE ON CONVERGENT TECHNOLOGIES FOR THE ASIA-PACIFIC REGION, VOLS 1-4, 2003, : 384 - 387
  • [42] Emotion interactive robot focus on speaker independently emotion recognition
    Kim, Eun Ho
    Hyun, Kyung Hak
    Kim, Soo Hyun
    Kwak, Yoon Keun
    2007 IEEE/ASME INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT MECHATRONICS, VOLS 1-3, 2007, : 280 - 285
  • [43] SPEAKER-CONSISTENT PARSING FOR SPEAKER-INDEPENDENT CONTINUOUS SPEECH RECOGNITION
    YAMAGUCHI, K
    SINGER, H
    MATSUNAGA, S
    SAGAYAMA, S
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1995, E78D (06) : 719 - 724
  • [44] Speaker adaptation techniques for speech recognition with a speaker-independent phonetic recognizer
    Kim, WG
    Jang, M
    COMPUTATIONAL INTELLIGENCE AND SECURITY, PT 1, PROCEEDINGS, 2005, 3801 : 95 - 100
  • [45] Multimodal Emotion Recognition Based on the Decoupling of Emotion and Speaker Information
    Gajsek, Rok
    Struc, Vitomir
    Mihelic, France
    TEXT, SPEECH AND DIALOGUE, 2010, 6231 : 275 - 282
  • [46] Speaker-independent speech emotion recognition by fusion of functional and accompanying paralanguage features
    Qi-rong Mao
    Xiao-lei Zhao
    Zheng-wei Huang
    Yong-zhao Zhan
    Journal of Zhejiang University SCIENCE C, 2013, 14 : 573 - 582
  • [47] Speaker independent feature selection for speech emotion recognition: A multi-task approach
    Elham Kalhor
    Behzad Bakhtiari
    Multimedia Tools and Applications, 2021, 80 : 8127 - 8146
  • [48] On the relevance of high-level features for speaker independent emotion recognition of spontaneous speech
    Lugger, Marko
    Yang, Bin
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1959 - 1962
  • [49] Speaker independent feature selection for speech emotion recognition: A multi-task approach
    Kalhor, Elham
    Bakhtiari, Behzad
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (06) : 8127 - 8146
  • [50] Speaker-independent speech emotion recognition by fusion of functional and accompanying paralanguage features
    Qi-rong MAO
    Xiao-lei ZHAO
    Zheng-wei HUANG
    Yong-zhao ZHAN
    Frontiers of Information Technology & Electronic Engineering, 2013, 14 (07) : 573 - 582